Paulo Sérgio Boggio · Tanja S. H. Wingenbach · Marília Lira da Silveira Coêlho · William Edgar Comfort · Lucas Murrins Marques · Marcus Vinicius C. Alves  *Editors*

# Social and Affective Neuroscience of Everyday Human Interaction

From Theory to Methodology


#### *Editors*

**Paulo Sérgio Boggio** Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Marília Lira da Silveira Coêlho** Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Lucas Murrins Marques** Instituto de Medicina Fisica e Reabilitacao, Hospital das Clinicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, São Paulo, Brazil

**Tanja S. H. Wingenbach** School of Human Sciences, Faculty of Education, Health, and Human Sciences, University of Greenwich, Greenwich, London, UK

**William Edgar Comfort** Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Marcus Vinicius C. Alves** Faculty of Health Sciences of Trairi, Universidade Federal do Rio Grande do Norte, Santa Cruz, Brazil

The open access publication of this book has been supported by the Swiss National Science Foundation.

ISBN 978-3-031-08650-2 ISBN 978-3-031-08651-9 (eBook) https://doi.org/10.1007/978-3-031-08651-9

© The Editor(s) (if applicable) and The Author(s) 2023. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

## **Foreword: "Social and Affective Neuroscience: Open Questions Worth Trying to Answer"**

Social and affective neuroscience has exploded in visibility and popularity over the past decade, with its own meetings, journals, and even graduate programs. And rightly so: the topic is not only inherently interesting, but of very high clinical relevance, strongly interdisciplinary, and, perhaps most important for the young scientist, full of open questions. In short, it offers something for everyone. So does this collection of chapters, which weaves a comprehensive path through the field, surveying theories, basic and clinical research, and methods. Along the way, it raises perhaps the most interesting, and most difficult, questions facing the field: questions that every student and faculty member should be required to ponder. The very first set of chapters already introduces us to these.

A perennial question begins with the words we use. What are the categories or dimensions that can help organize our understanding of social behavior? Many schemes are possible. Perhaps the most general is a dimension of approach or avoidance, but this is too broad since it is not specific to the social domain. Emotions (the topic of Chap. 1) are more finely differentiated, but also not uniquely social. Modules for mating or aggression have been well-studied in animals and seem both social and categorical. Chapter 7 tells us about categories of moral judgment. How do we make sense of this variety?

One nagging worry is that the schemes we currently have available to make sense of it are all made up in the minds of scientists. That is, they are derivative of the words and concepts we happen to have for understanding the social world and, as such, are no better than folk psychology. In support of this view are constructivist theories of emotion, which point to the failure of cognitive neuroscience to find any specific dimensions or categories in the brain. Several chapters, especially the methodological ones in the latter part of the book, provide an antidote to this view: one just needs the right measures. Just as studies in animals have certainly found quite specific circuits for different types of social behavior, so too can we find systems for emotions in humans if only we develop the right tools. Nonetheless, it is clear that our current schemes will require revision, a requirement that is important not only theoretically but also for the practical diagnosis of disorders.

The second chapter introduces another long-standing question: To what extent is our social and emotional behavior rapid, modular, and automatic, and to what extent is it under deliberative control? This question is, of course, a species of the long-standing dual processing view in cognitive psychology. Once again complementing the methods chapters at the end, this question is addressed with a multimodal approach that amasses evidence from many sources, and that also leaves us with one of the most popular anatomical answers: the amygdala subserves rapid, nonconscious processing. The view is not without its detractors (myself amongst them), but it remains a viable hypothesis and one with patent clinical implications.

While the dual processing view originally focused on controlled versus automatic processing, it has since become associated with a myriad of attributes: cognitive/emotional, slow/fast, conscious/nonconscious, and (the topic of Chap. 3) brain/body. The role of the body, or neural representations of it, has been emphasized at both sensory and motor ends of social cognition, and there is now substantial evidence that embodiment matters. But this finding raises what is perhaps one of the deepest questions: are some psychological variables literally in the body? What exactly are the commitments to constitutive and causal relations in social and affective neuroscience? Is an emotion or a personality trait in the behavior, in the body, in the brain, or in all of these? My colleagues and I have recently argued that these should be thought of as in the brain, albeit, of course, with potent causal connections to body, genes, and environment.

This first batch of most difficult questions is followed by a set of no less interesting but perhaps more practically oriented questions that motivate several of the chapters in the middle of the book. Mirror neurons illustrate the power of a cross-species approach, and an analysis of sex differences raises the importance of considering development: clearly, both comparative and developmental approaches need to be well represented in a mature social affective neuroscience. We are also confronted with vexing challenges whose solution may not be an empirical finding, but rather a practical shift in the field itself. One ubiquitous challenge, not only in this discipline but in many, is simply imprecise language. Findings from the brain are "involved in," "related to," "underpin," or perhaps even are "important to" various presumed functions. But what exactly does this mean? That they play a causal role? Usually this is not tested. That they are the only mechanisms causing a certain function? Even less so. In the end, understanding human social and affective behavior requires not only multiple methods, comparisons with animal studies, and new theories, but it also requires more rigor in our terminology.

The book concludes with a nice overview of the many methods used in this field, ranging from ERPs to facial EMG, TMS, fMRI hyperscanning, and others. In many cases, these are illustrated with respect to specific applications and research questions, ranging from examples of studies in language to tutorials on machine learning. In the end, the reader will have traveled through theory, clinical application, numerous methods, and a lot of specific case studies of topics, well illustrating the richness of this vibrant field.

Two chapters on mirror neurons and on the brain's default-mode network tie together the need for strong methods development in tandem with tackling deep conceptual questions and raise perhaps the biggest question for the field: what exactly is social and affective neuroscience? This question brings us back in a way to the first question that I raised. Is social affective neuroscience merely neuroscience of a particular topic, just like chair perception neuroscience, or cake tasting neuroscience, or any other arbitrary topic? Or are there systems, specializations of some sort in the brain that show us that this discipline really carves nature at its joints in some way, and is not merely the invention of its practitioners? Quick answers have not stood up to scrutiny. True, the default-mode network is engaged during mind wandering and social cognition, but it is also there to some degree in monkeys and under anesthesia, examples where mind wandering seems unlikely. Specific structures like the amygdala and ventromedial prefrontal cortex are often brought up, but these are involved in many functions. The way forward in both cases may be to forge methods that can discover more subtle functional network signatures or specific neuronal subpopulations.

By way of closing on a personal note, this last question, related to the first, has also been a topic I have debated. Whereas my colleague Lisa Feldman Barrett thinks that emotion categories are constructed in some way, I think they are objectively to be discovered in Nature. Neither view is straightforward to unpack, but our debates have certainly helped me identify some of the difficulties with my own view. I also think this kind of back-and-forth characterizes the atmosphere of social affective neuroscience. We realize that the questions are fun but difficult, and we acknowledge that any findings are always preliminary, and often turn out to be wrong. Like much of neuroscience and psychology, the field has weathered the replication crisis and has put in place more robust practices and analyses. Yet the very complexity and interdisciplinary nature of social and affective neuroscience defy any simple formula for how to do research in it. In the end, understanding human social and affective behavior requires not only multiple methods, comparisons with animal studies, and new theories; it requires a lot of ingenuity and detective work to tell a plausible story. Chapters collected here will give readers a good sense of these issues, and I hope they will motivate them not only to read more but also to question and debate.

California Institute of Technology Pasadena, CA, USA

Ralph Adolphs

## **Preface**

This book on social and affective neuroscience was inspired by an event, the Sao Paulo School of Advanced Science on Social and Affective Neuroscience (SPSAN), which took place in August 2018 over a period of 10 days. The SPSAN was organised by the Social and Cognitive Neuroscience Laboratory (SCNL) of the Mackenzie Presbyterian University in Sao Paulo, Brazil, and funded by the Sao Paulo Research Foundation (FAPESP). The aim of the SPSAN was to deepen the understanding of social and affective neuroscience and learn more about how the current state of research can explain phenomena we experience in our daily lives as social beings. The event constituted a unique opportunity for 100 competitively selected undergraduate and postgraduate students as well as early career researchers (at post-doc level) from all over the world to be part of ten knowledge-enriching days of theoretical and practical learning. National and international leading scientists from the field of social and affective neuroscience were recruited as speakers and flown in.

The editors of this book would like to thank all speakers who contributed to the SPSAN and all participants who engaged in the event and together made it the success it was. We are grateful to FAPESP for their generous funding, which allowed us to bring together scientists from all over the world. We also thank the Brazilian Academy of Science and the companies 'Rogue', 'Natura', 'Proibras' and 'Ergoneers' for their support. We would like to express thanks to Fanny Lachat, who was part of the core team developing the theme that became the SPSAN. The event would not have been possible without the help and hard work of all the volunteers from the SCNL, especially Ruth Lyra Espinosa, Carol Nakao, Beatriz Ribeiro, Leticia Yumi, Patricia Cabral, Graziela Bonato, Fernanda Pantaleão, Carolina Gudin and Camila Valim. Last but not least, we are grateful to the Mackenzie Presbyterian University, which provided us with the physical space and technical support needed to host the event.

The SPSAN was an enriching and inspiring event. With the intent to present a state-of-the-art overview of current topics within social and affective neuroscience, the idea of this anthology was born. Speakers from the SPSAN alongside other leading researchers from social and affective neuroscience contributed to the book. Drawing on contributors with slightly varying areas of expertise, the book provides some answers to the question of how social and affective neuroscience can explain various aspects of everyday human interaction. The book covers current methods and theory in social and affective neuroscience, applying an evidence-based approach. Because it combines knowledge of social and affective neuroscience with the methodology needed to conduct social-affective neuroscientific research, the book is of interest to researchers, university teachers and laypeople with an interest in these topics.

Paulo Sérgio Boggio, São Paulo, Brazil

Tanja S. H. Wingenbach, London, UK

Marília Lira da Silveira Coêlho, São Paulo, Brazil

William Edgar Comfort, São Paulo, Brazil

Lucas Murrins Marques, São Paulo, Brazil

Marcus Vinicius C. Alves, Santa Cruz, Brazil

## **About the Book**

This book seeks to address central aspects for the scientific understanding of social and affective neuroscience as a whole. The book contains four parts: (I) Affective Neuroscience; (II) Social Neuroscience and Moral Emotions; (III) Clinical Neuroscience; and (IV) Methods Used in Social and Affective Neuroscience.

The first part, *Affective Neuroscience*, presents the current state of affective neuroscience research. The term 'affective' relates to moods and emotions and their processing, which plays a crucial role in human social interactions. We are constantly presented with our own emotions and moods and those of others. In social interactions, perceived affect processing and the processing of one's own affect constitute ongoing necessities. Affective states guide our attention as well as motivation and thus have an effect on social interactions. The chapters in this part investigate psychological, neural and molecular aspects of affective neuroscience.

The *Social Neuroscience and Moral Emotions* part covers phenomena present in society. Social and moral emotions guide our behaviour towards others, but the magnitude with which individuals experience these emotions varies greatly. The chapters in this part present neurobiological and behavioural processes relevant to social interactions, covering topics such as mirror neurons and sex differences in social cognition as well as the development of morality and trust in the realm of social interaction.

The *Clinical Neuroscience* part focuses on disorders/conditions that affect social cognition. As much as neuroscience can be used to explain everyday phenomena in social interactions, neuroscience can explain disorders/conditions of clinical relevance. The investigation of brains of healthy individuals compared to those with clinical diagnoses provides invaluable information on the disorders and the associated symptoms. With some conditions affecting social functioning, atypical brain processes can explain abnormalities in regard to social skills. Neuroscience can further explain emotion regulation and deficiency thereof. The chapters in this part stretch from clinical neuroscience in childhood and adolescence to adulthood.

The last part covers *Methods Used in Social and Affective Neuroscience*. Experts present state-of-the-art methods to investigate social and affective neuroscience using typical, currently widely applied equipment. That is, brain imaging (MRI, [...] more experienced readers.


## **Contributors**

**Marcus Vinicius C. Alves** Faculty of Health Sciences of Trairi, Universidade Federal do Rio Grande do Norte, Santa Cruz, Brazil

**Edson Amaro Jr.** LIM-44, Departamento de Radiologia, Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo, Brazil

**Paulo Rodrigo Bazán** LIM-44, Departamento de Radiologia, Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo, Brazil

Hospital Israelita Albert Einstein, São Paulo, Brazil

**Claudinei E. Biazoli Jr.** Center for Mathematics, Computing, and Cognition, Federal University of ABC, São Bernardo do Campo, Brazil

**Paulo Sérgio Boggio** Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Dimitris Bolis** Independent Max Planck Research Group for Social Neuroscience, Max Planck Institute of Psychiatry, Munich-Schwabing, Germany

International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich, Germany

Munich Medical Research School (MMRS), Dekanat der Medizinischen Fakultat, Ludwig-Maximilians-Universitat Munchen, Munich, Germany

**Patrícia Cabral** Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Leonardo Christov-Moore** Brain and Creativity Institute, University of Southern California, Los Angeles, CA, USA

**William Edgar Comfort** Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Thiago da Silva Gusmão Cardoso** Centro Adventista Universitário de São Paulo, São Paulo, Brazil

**Claudia Berlim de Mello** Department of Psychobiology, Universidade Federal de São Paulo, São Paulo, Brazil

**Mariana Rachel Dias da Silva** Tilburg University Cognitive Science and Artificial Intelligence Department, Tilburg, The Netherlands

**Ana Luísa Freitas** Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Óscar F. Gonçalves** Proaction Lab, CINEICC – Faculty of Psychology and Educational Sciences, University of Coimbra, Coimbra, Portugal

**Marco Iacoboni** Department of Psychiatry and Biobehavioral Sciences, Ahmanson-Lovelace Brain Mapping Center, Brain Research Institute, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA

**Jonas Kaplan** Brain and Creativity Institute, University of Southern California, Los Angeles, CA, USA

**Juha M. Lahnakoski** Forschungszentrum Jülich, Institute of Neurosciences and Medicine (INM), Jülich, Germany

**Paulo Guirro Laurence** Social and Cognitive Neuroscience Laboratory and Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Katerina Lukasova** Postgraduate Program in Neuroscience and Cognition – PPGNC, Federal University of ABC – UFABC, São Bernardo, Brazil

**Elizeu Coutinho de Macedo** Social and Cognitive Neuroscience Laboratory and Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Lucas Murrins Marques** Instituto de Medicina Fisica e Reabilitacao, Hospital das Clinicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, Brazil

**Lauri Nummenmaa** Turku PET Centre and Turku University Hospital, Turku, Finland

Department of Psychology, University of Turku, Turku, Finland

**Alice Mado Proverbio** Department of Psychology, University of Milano-Bicocca, Milan, Italy

**Vesa Putkinen** Turku PET Centre and Turku University Hospital, Turku, Finland

**Gabriel Rego** Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**João R. Sato** Center for Mathematics, Computing, and Cognition, Federal University of ABC, São Bernardo do Campo, Brazil

**Wataru Sato** Psychological Process Research Team, Guardian Robot Project, RIKEN, Kyoto, Japan

**Leonhard Schilbach** Independent Max Planck Research Group for Social Neuroscience, Max Planck Institute of Psychiatry, Munich-Schwabing, Germany

LVR Klinikum Dusseldorf/Kliniken der Heinrich-Heine-Universitat Dusseldorf, Düsseldorf, Germany

Ludwig-Maximilians-Universitat, Medical Faculty, Munich, Germany

**Kerttu Seppälä** Turku PET Centre and Turku University Hospital, Turku, Finland

**Marília Lira da Silveira Coêlho** Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

**Lucas R. Trambaiolli** Basic Neuroscience Division, McLean Hospital – Harvard Medical School, Belmont, MA, USA

**Tanja S. H. Wingenbach** School of Human Sciences, Faculty of Education, Health, and Human Sciences, University of Greenwich, Greenwich, London, UK

**Alberto Zani** School of Psychology, Vita-Salute San Raffaele University, Milan, Italy

## **About the Editors**

**Paulo Sérgio Boggio** holds a Bachelor's degree in Psychology from the University of São Paulo (1998), a professional development certificate in Neuropsychology from the Neurology unit at USP Medical School, a Master's degree in Experimental Psychology from the University of São Paulo (2004) and a PhD in Psychology (Neuroscience and Behavior) from the University of São Paulo (2007). He leads the Laboratory of Cognitive and Social Neuroscience at the Center for Health and Biological Sciences, Mackenzie Presbyterian University. He is a professor in Developmental Disorders and Psychology at Mackenzie Presbyterian University. In 2011, he was elected an affiliate member of the Brazilian Academy of Sciences. He is a research productivity Fellow at CNPq – Level 1C. He has wide experience in the field of psychology, with an emphasis on neuropsychology and cognitive, social and affective neuroscience. He also has experience with the use of non-invasive techniques of brain stimulation, high-density EEG, eye-tracking and other techniques.

**Tanja S. H. Wingenbach** holds a B.Sc. in Psychology (University of Luxembourg), an M.Sc. in Clinical Psychology (University of Basel), and a PhD in Psychology (University of Bath). From 2017 to 2019, she held a post-doctoral fellowship at the Social and Cognitive Neuroscience Laboratory, Mackenzie Presbyterian University in Sao Paulo. She was a post-doctoral senior research fellow at the University of Zurich / University Hospital Zurich from 2019 to 2022, after which she joined the University of Greenwich as a Lecturer in Psychology. Her research falls within emotion sciences and encompasses typical as well as atypical populations.

**Marília Lira da Silveira Coêlho** obtained her Bachelor's degree in Physiotherapy from University Center of Bahia in 2007, holds a Master's in Health and Medicine from Federal University of Bahia (2010) and a PhD in Developmental Disorders from Mackenzie Presbyterian University (2017). She is a professor in the Department of Physiotherapy and researcher at the Social and Cognitive Neuroscience Laboratory at the Center for Health and Biological Sciences, Mackenzie Presbyterian University. Her main research interests are multisensory integration, embodiment and ownership experiences and non-invasive techniques of brain stimulation.

**William Edgar Comfort** holds a BSc in Psychology from Bangor University (2007), an MSc in Neuropsychology from the University of Bristol (2012), with a clinical internship at the Head Injury Therapy Unit (HITU) of Frenchay Hospital, Bristol, and a PhD in Neuroscience and Cognition from the Universidade Federal do ABC (2015). He was previously an associate professor of Educational Psychology at Osasco University (UNIFIEO). He is currently a FAPESP-funded post-doctoral research fellow at the Social and Cognitive Neuroscience Laboratory of Mackenzie Presbyterian University and teaches Neuroscience and Applied Psychology at Mackenzie Presbyterian University. His research interests are in psychophysics, computational modelling, eye-tracking, EMG, facial recognition, neural processing of facial expressions and visual perception impairments in schizophrenia.

**Lucas Murrins Marques** obtained his Bachelor's degree in Psychology from Mackenzie Presbyterian University in 2014, holds a Master's degree in Basic Psychology from Minho University (2013), a Master's in Developmental Disorders from Mackenzie Presbyterian University (2016) and a PhD in Developmental Disorders from Mackenzie Presbyterian University (2020). He is currently a post-doctoral research fellow at IMREA from the Faculty of Medicine of University of São Paulo (Grant 2021/05897-5, São Paulo Research Foundation; FAPESP). His main research interests are in brain stimulation, moral judgement, psychophysiology, machine learning, emotion regulation and well-being.

**Marcus Vinicius C. Alves** is a Professor at Universidade Federal do Rio Grande do Norte (UFRN) and has a PhD in Psychobiology from Universidade Federal de São Paulo (Brazil) and experience as a visiting researcher at Memorial University of Newfoundland (Canada) and as a research and development specialist at Université du Luxembourg (Luxembourg). He obtained his Bachelor's degree in Psychology from the Federal University of Bahia in 2011 and holds a Master's in Psychobiology from the Federal University of São Paulo (2013). Alves' research interests include learning, memory, incidental and motivated forgetting, mental effort (processing systems, attention load, cognitive load theory and executive control) and social cognition, using mostly eye tracking and pupillometry.




## **List of Figures**



Fig. 5.5 Grand-average ERPs recorded in professional basketball players (**a**) and naïve viewers (**b**) in response to correct and incorrect basketball actions at frontal, parietal, and occipital scalp sites. (Taken and redrawn from Proverbio et al., 2012)

Fig. 5.6 (Above) Examples of stimuli used for the study on neurons' sensory preference (i.e. the face of a conspecific emitting a vocalization vs. the opening and closing of a disc without any facial stimulus). (Below) Bioelectrical responses displayed by a multisensory cell of the associative auditory cortex of the macaque monkey. Note that the response to the combined voice and face conditions (red line) is far greater than the response to uni-sensory stimulation (in this case, the response to the incongruous coupling between disc and voice that did not stimulate the cell enough is also drawn as a yellow line). (Adapted from Ghazanfar and Schroeder (2006). Courtesy of the authors)

Fig. 5.7 Visual stimulation consisted of the silent presentation of pictures of animals and tools, while the auditory stimulation consisted of the blind presentation of their call or typical sound. The audiovisual stimulation involved the integration between the two modes. Brain images show the BOLD signals of neurometabolic activation obtained by fMRI in the various stimulation conditions. Note that the audiovisual condition activated the multimodal prefrontal regions, as well as the motor and premotor cortices, the posterior region of the STS, and the MTG. (Drawn and modified by Beauchamp et al. (2004a, b). Courtesy of the authors)

Fig. 5.8 Some examples of 'sound' (top) and 'silent' (centre) visual stimuli presented together with other hundreds of stimuli to unaware observers, instructed to detect and respond to infrequent images of cycling races. The analysis of ERP peaks, together with the reconstruction of their intracerebral generators by means of the swLORETA technique, demonstrated the activation of the left medial temporal cortex after only 110 ms from the presentation of the image. The extraction of sound information associated with the use of familiar tools after ~200 ms activated the primary (BA38) and secondary (BA41) auditory cortices. This information is responsible, for example, for auditory hallucinations, which, in this case, refer, in a dim way, to the call of the specific sound produced by the tool (in the figure, the sounds produced by the sax or by the infernal chainsaw). (Taken from Proverbio et al. (2011b). Courtesy of the authors)

Fig. 5.9 (**a**) Examples of visual stimuli used in the study by Hasegawa et al. (2004). (**b**) Activation of the left temporal region as a function of musical performance in the three groups of participants. (**c**) fMRI activations in response to an exclusively visual stimulation in the brain of professional pianists. (Courtesy of the authors)








## **Part I Affective Neuroscience**

## **Chapter 1 Molecular Imaging of the Human Emotion Circuit**

**Lauri Nummenmaa, Kerttu Seppälä, and Vesa Putkinen**

**Abstract** Emotions modulate behavioral priorities via central and peripheral nervous systems. Understanding emotions from the perspective of specific neurotransmitter systems is critical, because of the central role of affect in multiple psychopathologies and the role of specific neuroreceptor systems as corresponding drug targets. Here, we provide an integrative overview of molecular imaging studies that have targeted the human emotion circuit at the level of specific neuroreceptors and transmitters. We focus specifically on opioid, dopamine, and serotonin systems, given their key role in modulating motivation and emotions, and discuss how they contribute to both healthy and pathological emotions.

**Keywords** Molecular imaging · Human emotions · Dopamine system · Serotonin system · Opioid system

#### **Introduction**

Emotions prepare us for action. They coordinate systemic activation patterns at multiple physiological and behavioral scales to promote survival. Most modern emotion theories consider emotions as modulatory systems interacting with both lower-order systems, such as those involved in homeostasis, as well as higher-order cognitive circuits supporting decision-making. Categorical models of emotions propose that evolution has specified a set of basic emotions (usually including anger, fear, disgust, happiness, sadness, and surprise but possibly also others) that support specialized survival functions (Cordaro et al., 2018; Cowen & Keltner, 2017; Ekman, 1992; Nummenmaa & Saarimäki, 2017; Panksepp, 1982). These basic

L. Nummenmaa (\*)

Turku PET Centre and Turku University Hospital, Turku, Finland

Department of Psychology, University of Turku, Turku, Finland e-mail: latanu@utu.fi

K. Seppälä · V. Putkinen Turku PET Centre and Turku University Hospital, Turku, Finland

**Fig. 1.1** Statistical summary of brain regions involved in emotional processing based on the NeuroSynth database (Yarkoni et al., 2011)

emotions are characterized by discrete neural and physiological substrates, distinctive subjective feelings (such as "I feel happy"), expressions, and a selective functionally dependent neural basis (Kreibig, 2010; Nummenmaa et al., 2014, 2018; Saarimäki et al., 2016; Tracy & Randles, 2011). Much of recent neuroimaging work has aimed at mapping the functional organization of the emotion circuits in the brain using functional magnetic resonance imaging (Hudson et al., 2020; Nummenmaa & Saarimäki, 2017; Wager et al., 2015), and these studies have been successful in delineating the neurobiological architecture of emotions (Fig. 1.1).

Meta-analyses of the BOLD-fMRI data have however yielded inconsistent support for the discrete neural basis of emotions. One proposed explanation for this is the low spatial resolution of BOLD-fMRI coupled with univariate analysis: if specific neural populations coding different emotions are intermixed within one voxel, their activation differences cannot be revealed by univariate techniques. In line with this view, multivariate pattern recognition studies have consistently provided support for a discrete neural basis of different basic and complex emotions (Kragel et al., 2016; Kragel & Labar, 2015; Putkinen et al., 2021; Saarimäki et al., 2016, 2018). Even though multivariate analysis techniques improve the discriminability and specificity of data patterns across different classes or conditions (Norman et al., 2006), they cannot resolve one of the main limitations of the BOLD-EPI data: the signal is unspecific with respect to the underlying neurotransmitter circuits.
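The contrast between univariate and multivariate readouts can be illustrated with a toy simulation. The sketch below is not the analysis used in any of the cited studies: the voxel templates, noise level, trial counts, and the simple dot-product classifier are all assumptions made for illustration, and the classifier is given the true templates, whereas real studies estimate them from separate training data. It shows only that two conditions can share the same regional mean while remaining separable from their spatial pattern.

```python
import random

random.seed(0)  # fixed seed so the toy simulation is reproducible

N_VOXELS = 20
N_TRIALS = 100
NOISE_SD = 0.5  # arbitrary noise level chosen for illustration

# Hypothetical voxel templates: neighbouring voxels carry opposite biases,
# so both conditions have the same regional mean (zero) by construction.
TEMPLATE_A = [0.5 if i % 2 == 0 else -0.5 for i in range(N_VOXELS)]
TEMPLATE_B = [-value for value in TEMPLATE_A]


def simulate_trial(template):
    """Add independent Gaussian noise to every voxel of a template."""
    return [value + random.gauss(0.0, NOISE_SD) for value in template]


def regional_mean(pattern):
    """Univariate summary: average signal over all voxels."""
    return sum(pattern) / len(pattern)


def classify(pattern):
    """Multivariate summary: pick the template with the larger dot product.

    The true templates are used here (an idealization); real analyses
    learn them from independent training data.
    """
    score_a = sum(p * t for p, t in zip(pattern, TEMPLATE_A))
    score_b = sum(p * t for p, t in zip(pattern, TEMPLATE_B))
    return "A" if score_a > score_b else "B"


trials_a = [simulate_trial(TEMPLATE_A) for _ in range(N_TRIALS)]
trials_b = [simulate_trial(TEMPLATE_B) for _ in range(N_TRIALS)]

# Univariate contrast: difference between the condition-wise regional means.
mean_gap = (sum(map(regional_mean, trials_a))
            - sum(map(regional_mean, trials_b))) / N_TRIALS

# Multivariate readout: fraction of trials assigned to the correct condition.
correct = (sum(classify(t) == "A" for t in trials_a)
           + sum(classify(t) == "B" for t in trials_b))
accuracy = correct / (2 * N_TRIALS)

print(f"regional mean difference: {mean_gap:.3f}")
print(f"pattern classification accuracy: {accuracy:.2f}")
```

In this idealized setting the regional mean difference stays close to zero while the pattern-based readout separates the conditions well, which mirrors the argument made above for multivariate techniques.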

A single voxel in an echo-planar image may contain neurons operating with a multitude of different neurotransmitters, whose net activation is reflected in the BOLD signal. Understanding emotions from the perspective of specific neurotransmitter systems is however critical, because of the central role of affect in multiple psychopathologies and the role of specific neuroreceptor systems as drug targets. For example, the most commonly assumed working mechanism of antidepressants involves either increased neurotransmission by increasing synaptic neurotransmitter levels (such as norepinephrine or dopamine [DA]) or specific agonist effects of the targeted receptors. Thus, it is imperative to delineate not just the anatomical but also the neuromolecular organization of the emotion circuits in the brain. Here, we provide an overview of the molecular mechanisms of emotions, with a specific focus on *in vivo* imaging studies of specific neurotransmitters and neuroreceptors in humans. We

**Fig. 1.2** Distribution of type-2 dopamine receptors, μ-opioid receptors, and 5-HT 1A transporters measured using PET radioligands

focus specifically on opioidergic, dopaminergic, and serotonergic mechanisms, as they can be readily studied *in vivo* in the human brain (Fig. 1.2).

#### **Studying Human Neuroreceptor Systems** *In Vivo*

The most commonly used functional imaging (fMRI) and electromagnetic (MEG/EEG) techniques for recording brain activation do not yield any information regarding the underlying mechanisms of neurotransmission. Because pharmacological microstimulation studies are not feasible in humans, the main approaches for studying emotion-related neurotransmission involve different activation, blockade, and depletion studies, as well as nuclear medicine imaging techniques for direct *in vivo* measurements.

#### **Pharmacological Activation and Blockage Studies**

The classical behavioral pharmacological approach involves delivering specific receptor agonists or antagonists or other pharmacologically active agents into the circulatory system or directly into the target tissue in the case of animal studies. In humans, these studies are difficult to conduct, because oral or intravenous administration leads to systemic rather than regionally specific effects, and it has been well established through animal studies that the effects of receptor agonists/antagonists can be regionally highly selective (Berridge & Kringelbach, 2015). One way of overcoming this limitation is to use a pharmacological imaging approach, where functional imaging or electromagnetic recordings are performed during pharmacological treatment versus a placebo condition, which allows identifying the brain regions where the drug action influences neural responses. However, these regional responses may still be influenced by system-level effects, and pinpointing the

specific regions whose pharmacological manipulation leads to altered BOLD signal is difficult. Furthermore, studies employing potent pharmacological agents such as morphine or dexamphetamine require strict clinical supervision. Finally, pharmacological manipulations may lead to physiological effects that directly confound the BOLD signal, such as respiratory depression caused by opioid agonists (Pattinson, 2008), further complicating their interpretation.

#### **Monoamine Depletion Studies**

A complementary approach to pharmacological activation and blockage studies involves techniques that temporarily lower the functioning of monoamines such as 5-HT, DA, and other catecholamines, typically by blocking the synthesis or restricting the intake of amino acid precursors. The most widely used techniques are acute tryptophan depletion (ATD) and acute phenylalanine/tyrosine depletion (APTD). ATD lowers 5-HT synthesis by dietary restriction of the 5-HT precursor l-tryptophan. The effect is amplified by the consumption of a large quantity of other amino acids that compete with tryptophan at the blood–brain barrier (Booij et al., 2003). APTD, in turn, targets the dopaminergic/catecholaminergic systems by restricting the dietary intake of their precursors, phenylalanine and tyrosine. Such techniques result in specific short-term effects in distinct neurotransmitter systems rather than on general protein metabolism in the brain (Booij et al., 2003); however, the interpretation of these results is complicated due to distinct system-level effects on transmitter synthesis. Nevertheless, these techniques are valuable when investigating the involvement of monoamine system function in specific mood disorders.

#### **Molecular Imaging with Positron Emission Tomography**

Functional molecular imaging using positron emission tomography (PET) is the current gold standard for *in vivo* molecular imaging in humans. It is based on injecting radiolabeled, biologically active molecules into the circulation. These molecules bind to specific target sites, and the unstable isotopes subsequently undergo positron emission decay. The radioisotope emits a positron, the antiparticle of the electron, which loses kinetic energy as it travels through brain tissue. After a certain degree of deceleration, the positron can interact with an electron, leading to an annihilation event producing two gamma photons (rays) moving in opposite directions. The gamma rays are recorded by the detector units of the PET camera, and on the basis of simultaneously detected gamma rays on the opposite sides of the detector ring, the location of the annihilation event can be computed. This subsequently allows reconstruction of the tracer uptake in the tissue. When combined with measurements of tracer input and output, these raw radioactivity counts can be transformed into biologically meaningful information such as radioligand binding at neuroreceptors.

This technique provides excellent biological resolution due to the potential for developing highly selective radioligands binding to different protein targets and spatial resolution up to a few millimeters. Despite its high sensitivity for *in vivo* biomarker tracing, PET lacks the capability for capturing the underlying tissue morphology at high spatial resolution; as such, this information usually needs to be acquired through separate MR or CT scans. Functional imaging of slow-acting neurotransmission is however possible (Backman et al., 2011; Zubieta et al., 2001), although temporal resolution is limited to tens of minutes for most neurotransmission studies. Modern integrated PET–MRI systems (Judenhofer et al., 2008) also allow for the simultaneous measurement of perfusion with both PET and arterial spin labeled MRI (Heijtel et al., 2014; Zhang et al., 2014), or perfusion with MRI and neuroreceptor occupancy with PET, significantly broadening the utility of PET (Sander et al., 2019). Furthermore, joint analysis of PET and structural MR images provides complementary information about the mesoscopic organization of the brain (Manninen et al., 2021). All in all, the PET technique is currently the most accurate and specific tool available for investigating *in vivo* neurotransmission in humans.
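The coincidence principle described in this section can be sketched geometrically. The example below assumes a strongly idealized, two-dimensional detector ring with a hypothetical radius; it ignores positron range, scatter, attenuation, detector size, and time-of-flight information, and it is not how clinical reconstruction software works. It shows only that each pair of simultaneously hit detectors defines a line of response on which the annihilation must lie, and that two such lines from the same point source intersect at that source.

```python
import math

RING_RADIUS_MM = 400.0  # hypothetical ring radius, for illustration only


def detector_hits(source, direction_rad, radius=RING_RADIUS_MM):
    """Return the two points where back-to-back photons emitted from
    `source` along `direction_rad` strike an ideal circular detector ring."""
    sx, sy = source
    dx, dy = math.cos(direction_rad), math.sin(direction_rad)
    # Solve |source + t * d|^2 = radius^2 for t, where d is a unit vector.
    b = sx * dx + sy * dy
    c = sx * sx + sy * sy - radius * radius
    root = math.sqrt(b * b - c)  # real whenever the source is inside the ring
    t_forward, t_backward = -b + root, -b - root
    return ((sx + t_forward * dx, sy + t_forward * dy),
            (sx + t_backward * dx, sy + t_backward * dy))


def intersect_lors(lor_a, lor_b):
    """Intersect two lines of response, each given as a pair of detector points."""
    (x1, y1), (x2, y2) = lor_a
    (x3, y3), (x4, y4) = lor_b
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-12:
        raise ValueError("lines of response are parallel")
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    px = (det_a * (x3 - x4) - (x1 - x2) * det_b) / denom
    py = (det_a * (y3 - y4) - (y1 - y2) * det_b) / denom
    return px, py


# Two annihilation events at the same (made-up) source position,
# with photon pairs emitted along different directions.
true_source = (30.0, 40.0)
lor_first = detector_hits(true_source, 0.3)
lor_second = detector_hits(true_source, 1.7)
estimated_source = intersect_lors(lor_first, lor_second)
print(estimated_source)
```

A single coincidence only constrains the event to a line; it is the accumulation of many such lines from many directions that allows the tracer distribution to be reconstructed.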

#### **The Dopamine System**

Rewards exert a powerful influence on our behavior. Both humans and animals are motivated to obtain various rewards ranging from food and sex to social contact, and the pleasurable sensations we experience on receiving the reward further reinforce our motivation to seek and consume the same reward in the future. The monoamine neurotransmitter dopamine (DA) and its receptors D1-D5 have been well-established as playing a key role in motor control and reward-related behavior and pleasure. There are multiple DA pathways in the brain that consist of neuronal projections which synthesize and release DA (Fig. 1.3). The mesolimbic pathway projects from the ventral tegmental area (VTA) to the ventral striatum. This pathway is particularly involved in processing incentive salience, generating pleasure responses and reinforcement learning. The mesocortical pathway projecting from the VTA to the prefrontal cortex is, in turn, more involved in executive functions although it also contributes to reward processing. The nigrostriatal pathway connects the substantia nigra to the striatum (putamen and caudate) and contributes critically to motor control. Finally, the tuberoinfundibular pathway connects the hypothalamus and the pituitary gland. Importantly, all the main functions of the dopamine system are also central to reward processing, and it comes as no surprise that the dopamine system has been implicated as one of the primary molecular pathways for reward (Wise & Rompre, 1989), and microinjection studies in animals have established that dopamine stimulation of the nucleus accumbens modulates incentive motivation (DiFeliceantonio & Berridge, 2016; Peciña & Berridge, 2013).

**Fig. 1.3** Main dopamine pathways in the brain

PET studies using the radioligand [11C]raclopride in humans have consistently demonstrated DA release in central pathways during reward processing. Due to the poor temporal accuracy of PET, it is difficult to dissect the contribution of reward expectation and consumption phases to the release of DA: It is difficult to design sufficiently long (~45 min) tasks where rewards would be only anticipated but not delivered. As a result, studies conducted in this area mix both anticipation- and consumption-related effects. The PET analysis of DA transmission in reward has shown that feeding, one of the most salient biological rewards, triggers DA release primarily in the striatum. Because the magnitude of DA release is associated with the evaluation of the subjective pleasantness of the meal, this finding has been interpreted as evidence for hedonic (rather than homeostatic) responses to feeding (Small et al., 2003). This is further supported by another series of studies, which measured DA release during intravenous glucose/placebo delivery, thus precluding the subjective evaluation of the reward value of the glucose, yet systemically altering the blood glucose levels simulating a postprandial state (Haltia et al., 2007, 2008). These studies found no differences between the glucose and placebo conditions, suggesting that alterations in circulating glucose levels are not sufficient for central DA release. Instead, the hedonic responses driven by the orosensory and chemical taste pathways appear to be crucial for the DA response triggered by feeding.

There is less evidence for DA processing of other primary reward signals, but some studies suggest that romantic (Takahashi et al., 2015) and maternal attachment-related rewards (Atzil et al., 2017) are processed via the dopamine system in humans. However, these studies are difficult to interpret as the latter (Atzil et al., 2017) reported dopamine activations in regions where [11C]raclopride has either

low or no specific binding and no sensitivity to even D2/D3R antagonist challenge (Svensson et al., 2019), and the former was based on an individual-differences approach (Takahashi et al., 2015) and failed to show significant main effects of DA release across the whole group of subjects. In addition, murine models typically show a decrease in DA release in response to social contact seeking (Manduca et al., 2014), rather than an increase as suggested by human PET data; this might however be due to cross-species differences in attachment circuits. Striatal DA reward signaling has however been shown to extend beyond biologically significant rewards. For example, more "cognitive" rewards such as listening to one's favorite music (Salimpoor et al., 2011), gambling (Joutsa et al., 2012), and playing video games (Koepp et al., 1998) lead to striatal dopamine release. In all of these tasks, the reward value is learned rather than intrinsic, suggesting that acquired reward signals are processed in comparable fashion via DA signaling as those with innate reward value. This is most clearly highlighted by data showing that simple cognitive tasks such as task switching may trigger striatal DA release as soon as they are coupled with rewards (Jonasson et al., 2014).

Negative emotions also induce DA release. One study using [18F]fallypride revealed increased dopamine release in the amygdala and mediolateral frontal cortex during processing of negative emotional words (Badgaiyan et al., 2009), while a subsequent study using [11C]raclopride found similar effects in the caudate nucleus and putamen (Badgaiyan, 2010). There are multiple possible explanations for the apparently contradictory findings showing that both pleasure and displeasure can lead to DA activation. For example, it is possible that the DA response to negative stimuli reflects preparatory avoidance behavior triggered by the aversive stimulus, consistent with the role of DA release in motor responses geared toward specific behavioral patterns. This might be reflected in similar activation as the preparatory approach for rewards during pleasurable events. Finally, type-2 DA receptors (D2R) have also been linked with executive control and working memory (Backman et al., 2011), and the emotion-dependent DA activations might reflect the prediction and planning of both escape (negative emotions) and seeking and exploration responses (positive emotions).

Recent PET–fMRI fusion imaging has also tried to dissect the specific role of DA in processing different aspects of emotions, specifically the pleasure-displeasure (valence) and arousal axes. This approach is based on separate PET measurement of neuroreceptor distribution, which can then be used to predict emotion-dependent BOLD responses in subsequent fMRI experiments (Karjalainen et al., 2017). The logic of these experiments is to examine whether interindividual variation in the regional BOLD responses is dependent on corresponding variability in neurotransmitter availability, which would be indicative of DA involvement in the emotional processes targeted in the fMRI experiment. However, this work has failed to establish associations between D2R availability and emotion-specific BOLD responses (Karjalainen et al., 2018) and instead suggests a key role of the opioid system in modulating basic affective responses (see below).
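The across-participant logic of such fusion analyses can be sketched as a simple correlation. The example below is a minimal illustration under stated assumptions, not the pipeline of the cited studies: the six participants and all numerical values are made up, a plain Pearson correlation stands in for whatever regional or voxelwise regression model a real study would use, and no inference about statistical significance is attempted.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for six hypothetical participants:
# one PET-derived receptor availability estimate per person ...
receptor_availability = [2.1, 2.4, 2.8, 3.0, 3.3, 3.7]
# ... and one emotion-dependent BOLD response estimate per person.
bold_response = [0.90, 0.80, 0.85, 0.60, 0.50, 0.40]

r = pearson_r(receptor_availability, bold_response)
print(f"across-participant correlation: r = {r:.2f}")
```

A reliably non-zero correlation across participants would be read as evidence that the receptor system is involved in the process probed by the fMRI task; with small samples, as in this toy example, such estimates are unstable.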

Given the central role of dopamine in modulating motivation and reward, it is not surprising that dysregulated dopaminergic neurotransmission is the hallmark of numerous addictive disorders (Volkow et al., 2009). Human imaging studies have demonstrated that alcohol and drug dependence are associated with lowered D2R availability (Martinez et al., 2012; Volkow et al., 1996, 2001). Additionally, drug-induced striatal dopamine responses are blunted in methamphetamine abusers (Volkow et al., 2014). With behavioral addictions and addiction-like behaviors, the results are less clear. Animal studies on obesity suggest that striatal D2R is downregulated in the obese brain (Johnson & Kenny, 2010), while human studies have yielded mixed results with some finding lower (de Weijer et al., 2011; Volkow et al., 2008; Wang et al., 2001) and others unaltered (Haltia et al., 2007, 2008; Steele et al., 2010) D2R availability in the striatum. Finally, pathological gambling is not associated with altered D2R availability (Joutsa et al., 2012). However, gambling-dependent dopamine signaling is amplified in pathological gamblers versus controls (Joutsa et al., 2012), in contrast to the blunting effect observed in amphetamine abusers upon drug administration (Volkow et al., 2014). In sum, substance abuse appears to markedly downregulate the D2R system possibly via direct pharmacological effects, whereas behavioral addictions and addiction-like states are modulated by at least partially independent pathways.

#### **Opioid System**

Endogenous opioids are expressed widely throughout the human central nervous system (Fig. 1.4) and numerous high-density receptor sites constitute central nodes in the human emotion circuit (Kantonen et al., 2020). Among the three classes of opioid receptors (μ, δ, and κ), the μ receptors mediate the effects of endogenous β-endorphins, endomorphins, enkephalins, and various exogenous opioid agonists (Henriksen & Willoch, 2008). The predominant action of μ-opioids in the central nervous system is inhibitory, but they can also exert excitatory effects. The neurons synthesizing β-endorphin are found in the arcuate nucleus in the hypothalamus and the nucleus tractus solitarii of the medulla, and they project extensively to regions throughout the CNS. Dopamine is oftentimes considered the primary neurotransmitter for reward processing (Wise & Rompre, 1989). Opioid and dopamine systems are however closely interlinked at the cellular level (Tuominen et al., 2015), and opioids can produce reward independently of dopamine (Hnasko et al., 2005), likely via partially independent molecular pathways. Moreover, both opioidergic and dopaminergic microstimulation of the nucleus accumbens modulate incentive motivation (DiFeliceantonio & Berridge, 2016; Peciña & Berridge, 2013), suggesting complementary roles of these neurotransmitter systems in motivational and hedonic aspects of reward.

Opiates are commonly used illicit drugs, particularly in the United States, where the lifetime prevalence of opioid use disorder exceeds 2% (Grant et al., 2016). Such high misuse potential is attributed to the strong "liking" responses—the pleasurable subjective experiences produced by drug consumption (Comer et al., 2012). However, experiments with drug-naïve volunteers have not provided consistent

**Fig. 1.4** Organization of the human opioid system in the brain. Note that as specific opioid neuron projections cannot be established, this figure instead characterizes the relative expression of different receptor subtypes in some of the key nodes of the emotion circuit

results on opioid agonists associated with liking or pleasure. Some studies report increased pleasure upon μ-receptor (MOR) agonist delivery (Riley et al., 2010; Zacny & Gutierrez, 2003, 2009), whereas others have not corroborated these findings (Ipser et al., 2013; Lasagna et al., 1955; Tedeschi et al., 1984). These discrepancies likely pertain to differences in the route of administration, receptor affinity, and genetically determined variation in receptor expression (Levran et al., 2012). Some recent experiments have found that opioid agonists shift the evaluation of external stimuli, making them seem more pleasant, without necessarily directly influencing tonic subjective emotional state per se (Heiskanen et al., 2019). Thus, it is possible that opioid agonists primarily influence the evaluative processing of emotions, rather than directly modulating the acute subjective feeling. Consequently, opioids might alleviate stress and dysphoria by shifting the evaluation of the internal and external world toward more positive directions.

By contrast, molecular imaging shows that reward consumption consistently triggers endogenous opioid release. Feeding leads to increased endogenous opioid release in the reward circuit and also elsewhere in the brain (Burghardt et al., 2015; Tuulari et al., 2017). However, this response is observed for both palatable and nonpalatable meals and is actually stronger for fast-metabolizing, non-appetizing liquid meals than for palatable pizza. Thus, the response is likely a combination of the low-level homeostatic pleasure of feeding after fasting which is presumably more intense in response to a quickly metabolized liquid meal and possibly a partially independent effect of subjective hedonic responses. Corroborating evidence for the role of the opioid system in processing primary rewards comes from studies showing that pleasurable social interaction (Hsu et al., 2013; Manninen et al., 2017) and strenuous physical exercise (Boecker et al., 2008; Saanijoki et al., 2017) induce central opioid release. Similar to dopamine, these effects extend beyond primary rewards; for example, positive moods induced by mere mental imagery induce opioid release in the amygdala (Koepp et al., 2009). Fusion imaging with PET and fMRI suggests that the opioid system governs particularly the arousal dimension of emotions. The more opioid receptors an individual has in their limbic system, the weaker their arousal-dependent BOLD responses observed in the brain's emotion circuits (Karjalainen et al., 2018). Accordingly, the opioid system might act as a buffer against socioemotional stressors, alleviating the negative feelings associated with one's own or another's misfortune (Karjalainen et al., 2017).

While the general role of the dopamine system in drug addictions is fairly clear-cut, the story is more nuanced with the opioid system. Alcohol dependence is associated with elevated MOR levels in the striatum (Heinz et al., 2005; Weerts et al., 2011), whereas cocaine dependence results in similar effects in more widespread regions, particularly cortical and cingulate areas (Gorelick et al., 2005). However, chronic opiate abuse is associated with MOR downregulation (Koch & Hollt, 2008; Whistler, 2012). Thus, the effects of drug abuse on MOR seem to be drug-specific. More consistent data comes from studies on obesity that have implicated downregulated μ-receptor action as one of the key pathophysiological mechanisms (Burghardt et al., 2015; Karlsson et al., 2015, 2016; Tuominen et al., 2015). These effects are also specific to obesity rather than a general feature of behavioral addictions, as μ-receptor downregulation is not observed in pathological gambling for example (Majuri et al., 2016). Moreover, despite the centrality of the opioid system in hedonia and affective functioning, there is no clear evidence of its involvement in the pathophysiology of mood disorders. PET imaging data are limited in scope, and the existing studies have yielded conflicting evidence on opioidergic alterations in major depression (Hsu et al., 2015; Kennedy et al., 2006). However, one recent large-scale study shows that subclinical depressive and anxious symptoms are consistently linked with MOR system downregulation (Nummenmaa et al., 2020). Finally, the opioid system may also contribute to affective pathophysiology due to its role in governing human attachment behavior whose disruptions are consistently linked with mood disorders (Mikulincer & Shaver, 2012). This is supported by PET studies that have consistently found that insecure attachment is linked with downregulated MOR in the limbic and paralimbic regions (Nummenmaa et al., 2015; Turtonen et al., 2021).

#### **Serotonergic System**

The monoamine neurotransmitter serotonin and its receptors 5-HT1 to 5-HT7 are involved in the regulation of sleep, appetite, mood, and pleasure, as well as in cognitive and physiological processes. In the central nervous system, serotonin is produced in the raphe nuclei in the brainstem, from where the

**Fig. 1.5** Main serotonin pathways in the brain

serotonergic projections extend to the striatum and neocortex (Fig. 1.5). The brain's serotonergic systems also play a critical role in avoidance behaviors as well as fear and anxiety. Activation of the serotonergic system is critical for avoidance behavior in rodents (Deakin & Graeff, 1991), and genetic variations in serotonin transporter (SERT) expression influence the fear circuit's responsiveness to acute threat signals in humans (Hariri et al., 2002). Thus, major categories of anxiolytic drugs also inhibit SERT.

While dopamine and opioid systems are centrally involved in the pathophysiology of addictive disorders, the SERT system is consistently implicated in mood regulation and consequently in the pathogenesis of mood disorders (Mann, 1999). Although initial reports on SERT in mood disorders have been variable, meta-analyses suggest that serotonin transporter availability is consistently lowered in depression (Ichimiya et al., 2002; but see Andrews et al., 2015), and altered serotonergic neurotransmission is also considered a hallmark of depression (Drevets et al., 1999). Accordingly, the most widely used and effective antidepressants act by increasing extracellular serotonin levels. Importantly, individual differences in the expression of the serotonin transporter mediate the effects of stressful life events on the onset of depression (Risch et al., 2009). In a similar fashion, serotonin transporter availability varies seasonally, suggesting that altered serotonergic function may also underlie the pathophysiology of seasonal affective disorders (Praschak-Rieder et al., 2008).

Functional molecular imaging of the serotonergic system has been limited due to the lack of radioligands that show sensitivity to endogenous serotonin levels, essentially preventing serotonin activation studies with PET. However, fusion PET–fMRI imaging has elucidated the role of SERT in emotional processing. A number of studies indicate that the serotonergic system regulates amygdala responsiveness to

facial expressions of emotions (Fisher et al., 2006, 2009; Rhodes et al., 2007; Selvaraj et al., 2015). For instance, PET–fMRI studies have found an inverse relationship between 5-HT1A receptor density in the dorsal raphe nucleus (DRN) or 5-HT2A density in the prefrontal cortex and the magnitude of amygdala BOLD response to emotional faces (Fisher et al., 2006, 2009, 2011; Selvaraj et al., 2015). Some studies have also yielded conflicting results, with no association between 5-HT1A binding and emotional face processing (Kranz et al., 2018). For practical and economic reasons, these types of multimodal neuroimaging studies have limited statistical power (oftentimes n < 30), which may yield inconsistent effects in correlational designs. However, pharmacological activation studies provide corroborating evidence for serotonergic modulation of amygdala responses to threat. Multiple studies have documented that selective serotonin reuptake inhibitors (SSRIs) modulate amygdala reactivity to emotional facial expressions (Anderson et al., 2007; Bigos et al., 2008; Harmer et al., 2006; Murphy et al., 2009). These effects are however not just face-specific but extend to emotional processing in general and also to emotions derived from natural speech. The serotonin and norepinephrine receptor antagonist mirtazapine attenuates responses to unpleasant events in sensorimotor and anterior areas while modulating responses to arousing events in cortical midline structures. These effects are paralleled by increased functional connectivity between cortical midline and limbic areas during pleasant events (Komulainen et al., 2017), suggesting large-scale modulation of affective processing by serotonergic drugs.

From a clinical viewpoint, subjective feelings linked with the neural and autonomic emotional response are also an important facet of mood disorders. In particular, negative self-concept and increased self-focus play an important role in the pathophysiology of depression. Some studies suggest that the serotonergic system can influence how subjects interpret and process self-relevant affective information. Mirtazapine attenuates self-referential emotional processing in healthy volunteers, as manifested in decreased cortical midline activation (Komulainen et al., 2016). This mechanism could underlie one form of serotonin-dependent antidepressant action. This is further evidenced in clinical trials, which show how short-term escitalopram treatment regulates self-referential processing in patients with major depressive disorder (Komulainen et al., 2018). Thus, serotonergic modulation seems to occur at multiple levels of the human emotion circuit, ranging from sensory to evaluative, cognitive and self-referential processes, and the serotonergic action of antidepressants likely impacts all these levels.

#### **Conclusions**

Recent advances in nuclear medicine imaging have helped to elucidate the role of opioid, dopamine, and serotonin systems in human emotions. There is clear evidence that dopamine and opioid systems modulate hedonic processes. However, both dopaminergic and opioidergic activation is observed during negative emotions

too, suggesting that they may also support general motivational and arousal-modulation components of emotions. At the pathophysiological level, the dopamine system is more clearly linked with substance abuse and addictive disorders, whereas opioidergic activations vary from substance to substance, with clear downregulation observed particularly in obesity. The serotonin system links more clearly with negative emotions including fear and sadness, yet outside pharmacological and clinical studies, the majority of these data come from pharmacological fMRI studies and those correlating transporter availability with BOLD–fMRI responses.

There is no clear one-to-one mapping between specific emotions or emotional behaviors and specific neurotransmitters. Obviously, numerous neurotransmitters have a wide variety of roles, and their specific actions are not limited to emotional behavior. Human imaging studies are challenging to conduct and are limited by radioligand pharmacokinetics and affinity. For the major neurotransmitter systems implicated in emotion, reliable radioligands exist for imaging serotonin, dopamine, opioid and endocannabinoid receptors and transmitters. For opioid and dopamine systems, there are also radioligands available that are sensitive to endogenous transmitter levels, whereas this has yet to be achieved for serotonin and endocannabinoid systems. In sum, targeting neurotransmitter mechanisms of emotions using PET is a powerful tool for dissecting the molecular mechanisms of emotions, further potentiated by next-generation PET–MRI devices which allow us to address the molecular specificity of emotion-related BOLD activation.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 2 The Neurocognitive Mechanisms of Unconscious Emotional Responses**

**Wataru Sato**

**Abstract** The neurocognitive mechanism of emotion without conscious awareness has long been a subject of great interest (Pribram KH, Gill MM, Freud's "project" re-assessed: preface to contemporary cognitive theory and neuropsychology. Basic Books, 1976). Several previous psychological studies have used subliminal presentations of emotional facial expressions in the context of the affective priming paradigm to investigate unconscious emotional processing (e.g., Murphy ST, Zajonc RB, J Person Soc Psychol 64:723–739, 1993; for a review, see Eastwood JD, Smilek D, Conscious Cognit 14:565–584, 2005). In a typical application of this paradigm, a facial expression depicting a negative or positive emotion is flashed briefly as a prime, then an emotionally neutral target (e.g., an ideograph) is presented. Participants are asked to make emotion-related judgments about the target. The studies reported that evaluations of the target were negatively biased by unconscious negative primes, compared to positive primes. This effect has been interpreted as evidence that unconscious emotion can be elicited and that it affects the evaluation of unrelated targets.

**Keywords** Unconscious emotional responses · Amygdala · Subcortical visual pathway · Emotional states

#### **Introduction**

The neurocognitive mechanisms for emotion without conscious awareness have been a long-standing topic of research (Pribram & Gill, 1976). Several previous psychological studies have investigated unconscious emotional processing by means of the paradigm of subliminal affective priming (e.g., Murphy & Zajonc, 1993; for a review, see Eastwood & Smilek, 2005). In these studies, a facial expression displaying negative or positive emotion is presented subliminally as a prime, then an emotionally neutral target, such as an ideogram, is presented supraliminally. Participants are instructed to evaluate the target. The studies showed that unconscious negative primes bias evaluations of the target more negatively than positive primes. This effect has been discussed as evidence that emotion can be unconsciously evoked, and that it modulates the evaluation of subsequent targets.

W. Sato (\*)

Psychological Process Research Team, Guardian Robot Project, RIKEN, Kyoto, Japan e-mail: wataru.sato.ya@riken.jp

© The Author(s) 2023

P. S. Boggio et al. (eds.), *Social and Affective Neuroscience of Everyday Human Interaction*, https://doi.org/10.1007/978-3-031-08651-9\_2

The subliminal affective priming paradigm, however, does not always produce clear effects, and several previous studies have failed to find the effects (e.g., Kemps et al., 1996). While Murphy and Zajonc (1993) found that the priming effect is stronger with subliminal than supraliminal emotional primes, the use of a very short presentation duration for stimuli may prevent even unconscious processing of the stimuli.

Furthermore, the neural mechanisms for unconscious emotional processing remain unclear. Although several neuroimaging (e.g., Morris et al., 1998) and neuropsychological (e.g., Kubota et al., 2000) studies have suggested that the amygdala plays an indispensable role in this process, previous findings are not consistent and debate remains in the literature (Pessoa & Adolphs, 2010). Additionally, the neural pathways underlying unconscious emotional processing remain unexplored. While some studies provided correlational data suggesting that the subcortical visual pathway sends information to the amygdala to implement unconscious emotional processing (e.g., Morris et al., 1999), there was no causal evidence. In addition, precise data on the timing of emotional processing in the amygdala were scarce.

In this chapter, I present the findings of our psychological and neuroscientific studies that investigated these issues. Our psychological experiments revealed that emotion arises rapidly and unconsciously. We identified the neural mechanisms for this process using functional magnetic resonance imaging (fMRI) and intracranial electroencephalography (EEG).

#### *Psychological Study of the Unconscious Processing of Emotional Expressions*

First, we tried to clearly demonstrate that emotional responses arise prior to the conscious awareness of the stimuli evoking such responses using the subliminal presentation of dynamic facial expressions. Dynamic facial expressions may be relevant in this regard because these are more ecologically valid than static expressions. Some previous psychological research has shown that dynamic facial expressions induce more obvious behavioral responses, such as subjective emotion elicitation (Sato & Yoshikawa, 2007b) and facial mimicry (Sato & Yoshikawa, 2007a), than static expressions. These data imply that it is advantageous to use dynamic rather than static facial expressions when attempting to elicit unconscious emotions.

We tested 22 healthy participants. As prime stimuli, we presented dynamic and static facial expressions of fear and happiness for 30 ms. The raw materials of the primes were grayscale photographs of facial expressions depicting fearful, happy, and neutral emotions, and they were used to create dynamic facial expressions by a morphing method. First, facial expressions with 34% and 66% intensities were created, and then 34%, 66%, and 100% facial expressions were displayed in succession to create a dynamic clip. The presentation duration for each image was 10 ms; therefore, the duration of each clip was 30 ms. The photographs of 100% facial expressions were presented as static expressions for 30 ms. A randomized mosaic image was made from a neutral face photograph by splitting the photo into pieces and randomly reordering them. The target stimuli were emotionally neutral ideograms. In each trial (Fig. 2.1), after a fixation cross, a prime stimulus was presented to either the left or right hemi-visual field; this was immediately replaced by a mask in the same place for 300 ms. Directly afterward, the target ideogram was displayed at the same location for 1000 ms. Finally, the rating scale was displayed and participants rated their preference for the target ideogram. After the subliminal priming task, we conducted a forced-choice recognition session and confirmed that no participant had consciously perceived the prime stimuli.

**Fig. 2.1** Sato et al.'s (2014b) study. (Upper) An illustration of the trial sequence. The prime stimuli of dynamic and static facial expressions of fear and happiness were presented subliminally. (Lower) Mean (± *SE*) preference ratings. The asterisks indicate a significant difference between the fear and happiness conditions
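The randomized mosaic described above (a photograph split into pieces that are then randomly reordered) can be sketched as follows. This is only an illustration under stated assumptions: the tile size, the image size, the function name `make_mosaic`, and the use of a random array as a stand-in for a photograph are all hypothetical and are not taken from the original study.

```python
import numpy as np

def make_mosaic(image, tile, rng):
    """Split a 2-D grayscale image into square tiles and shuffle their positions."""
    h, w = image.shape
    if h % tile or w % tile:
        raise ValueError("image sides must be multiples of the tile size")
    rows, cols = h // tile, w // tile
    # Rearrange to (rows, cols, tile, tile) so that each [r, c] entry is one tile.
    tiles = image.reshape(rows, tile, cols, tile).swapaxes(1, 2)
    flat = tiles.reshape(rows * cols, tile, tile)
    # Randomly reorder the tiles.
    shuffled = flat[rng.permutation(rows * cols)]
    # Reassemble the shuffled tiles into an image of the original size.
    return shuffled.reshape(rows, cols, tile, tile).swapaxes(1, 2).reshape(h, w)

rng = np.random.default_rng(0)
# Stand-in for a grayscale photograph (hypothetical 64 x 64 pixels).
face = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
mosaic = make_mosaic(face, tile=8, rng=rng)
```

Because the tiles are only moved, the mosaic keeps exactly the same pixel values as the source image, and therefore the same overall luminance, while the recognizable content is destroyed.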

For the preference ratings (Fig. 2.1), the results of our analysis of variance (ANOVA) with presentation condition (dynamic or static) and emotion (fear or happiness) as factors indicated that the interaction was significant. Follow-up simple effect analyses revealed that the effect of emotion was significant under the dynamic, but not static, presentation condition, indicating that the subliminal presentation of dynamic fearful versus happy facial expressions reduced preferences for targets.
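For readers less familiar with this kind of analysis, the interaction in a 2 × 2 within-subject design can be computed as a one-sample test on a difference of differences, whose squared *t* value equals the interaction *F*. The sketch below assumes a ratings array of shape (participants, presentation, emotion); the function name `interaction_f` and the toy numbers are hypothetical, and it is not the analysis code used in the study.

```python
import numpy as np

def interaction_f(ratings):
    """Interaction F (df = 1, n - 1) for a 2 x 2 within-subject design.

    ratings has shape (n_participants, 2, 2), indexed as
    [participant, presentation (dynamic, static), emotion (fear, happiness)].
    """
    ratings = np.asarray(ratings, dtype=float)
    n = ratings.shape[0]
    # Emotion effect (fear minus happiness) within each presentation condition.
    emotion_effect = ratings[:, :, 0] - ratings[:, :, 1]
    # Interaction contrast: emotion effect under dynamic minus under static.
    contrast = emotion_effect[:, 0] - emotion_effect[:, 1]
    t = contrast.mean() / (contrast.std(ddof=1) / np.sqrt(n))
    return t ** 2, (1, n - 1)

# Toy data (hypothetical): three participants whose contrasts are 1, 2, and 3.
toy = np.zeros((3, 2, 2))
toy[:, 0, 0] = [1.0, 2.0, 3.0]
f_value, df = interaction_f(toy)
```

The follow-up simple effects correspond to testing `emotion_effect[:, 0]` and `emotion_effect[:, 1]` separately against zero.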

The results demonstrated that dynamic facial expressions induce evident subliminal affective priming effects. These results extend our understanding of unconscious emotional processing and the boosting effect of dynamic facial expressions. No clear subliminal effects were detected under the static condition. The presentation duration may not have been sufficient to activate the emotional processing with static stimuli.

These results provide hints about the neural mechanism for the unconscious processing of facial expressions, implying that the mechanism is sensitive to dynamic information. This notion is in agreement with the neuroimaging finding that the unconscious emotional processing of facial expressions is performed via the subcortical visual pathway, comprising the superior colliculus and pulvinar, to the amygdala (Morris et al., 1999). Studies on anatomical connections in animals (Day-Brown et al., 2010) and humans (Tamietto et al., 2012) have revealed that the amygdala receives visual information through the subcortical pathway. Regarding the effect of visual motion information on these brain structures, a neuroimaging study in humans (Schneider & Kastner, 2005) and numerous physiological studies in animals (for a review, see Waleszczyk et al., 2004) indicated that the superior colliculus is more sensitive to dynamic than static information. Together with these data, our results suggest the possibility that unconscious emotional processing is implemented by the activation of the amygdala via the subcortical pathway.

#### *Psychological Study of the Unconscious Emotional Processing of Food*

Next, we tried to test the generalizability of unconscious emotional responses and their impact on daily behaviors using food stimuli. Emotional responses to food have important consequences for humans, both positively (e.g., facilitating wellbeing) and negatively (e.g., triggering overeating and lifestyle-related diseases). Previous psychological studies have shown that both the observation and ingestion of food evoke positive emotional reactions (Rodríguez et al., 2005), which in turn stimulate food intake (for a review, see Sørensen et al., 2003).

However, whether emotional responses to food could be unconsciously elicited remained unknown. As we discussed above, several psychological studies using the subliminal affective priming paradigm have shown that non-food emotional stimuli (e.g., facial expressions) induced unconscious emotional processing. On the basis of this evidence, we hypothesized that emotional responses to food would also be unconsciously activated.

In addition, we expected that unconscious food processing would have an influence on daily eating habits. Previous psychological studies have reported that eating habits can be assessed using self-reported questionnaires such as the Dutch Eating Behavior Questionnaire (DEBQ) (van Strien et al., 1986). The DEBQ assesses some eating habits related to overeating. Among the DEBQ sub-scales, several previous studies have shown that the external eating tendency, defined as eating behaviors in response to external (e.g., visual and olfactory) food stimuli, modulates automatic food processing (e.g., attentional shift to food; Brignell et al., 2009). Based on these data, we hypothesized that unconscious emotional reactions to food could be associated with external eating tendency.

To test these hypotheses, we examined unconscious and conscious emotional responses to food and non-food stimuli and the relationships between these responses and eating habits (Sato et al., 2016). We tested 34 healthy participants. All participants fasted for more than 3 h prior to the experiment. Unconscious emotional responses were tested using the subliminal affective priming paradigm (Murphy & Zajonc, 1993). Food stimuli were color photographs of fast food (e.g., hamburgers) and Japanese diet (e.g., grilled teriyaki fish) (Fig. 2.2). Randomized mosaic stimuli were made from the food stimuli; all food stimuli were split into small squares and randomly sorted. A mask stimulus was also prepared by creating a randomized mosaic pattern. The photographs of neutral faces were used as targets under the subliminal condition. The target stimuli were randomly assigned to the experimental conditions. We used the Japanese version of the DEBQ (van Strien et al., 1986) to assess eating habits related to overeating. In each trial under the subliminal condition, a food or mosaic prime was displayed for 33 ms in the left or right hemi-visual field after a fixation cross; this was immediately replaced by a mask image for 167 ms. The target face was then displayed in the center for 1000 ms. Finally, the response panel was displayed and participants rated their preferences for the target faces. In each trial under the supraliminal condition, after the presentation of the fixation cross, a food or mosaic target was displayed for 200 ms in the left or right hemi-visual field. Participants rated their preferences for the target images. A subsequent forced-choice recognition task was conducted to ensure that none of the participants had consciously recognized the primes.

Under the subliminal condition, the ANOVA with stimulus type (food or mosaic) as a factor revealed a significant main effect of stimulus type, demonstrating higher preference ratings for faces primed by food images than those for faces primed by mosaics (Fig. 2.2). Similarly, under the supraliminal condition, the main effect of stimulus type was significant, showing higher preference ratings for food images than for mosaics. Correlation analysis showed a significant positive correlation between food preference scores under the subliminal condition and external eating tendency (Fig. 2.2).

**Fig. 2.2** Sato et al.'s (2016) study. (Upper) The illustrations of food and mosaic stimuli. (Lower left) Mean (± *SE*) preference ratings in the subliminal condition. The asterisk indicates a significant difference between food and mosaic prime conditions. (Lower right) A scatter plot with a regression line showing a relationship between food preference scores under the subliminal condition and external eating tendency. The asterisk indicates a significant association
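As a reminder of what the correlation analysis quantifies, the Pearson coefficient between two per-participant measures can be written in a few lines. The variable names and the five data points below are purely hypothetical placeholders, not values from the study, and the chapter does not specify which correlation coefficient was used.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical per-participant scores, for illustration only.
subliminal_food_preference = [0.2, 0.5, 0.1, 0.8, 0.6]
external_eating_score = [2.9, 3.4, 2.5, 4.1, 3.3]
r = pearson_r(subliminal_food_preference, external_eating_score)
```

A positive `r` means that participants with higher scores on one measure tend to have higher scores on the other.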

These results revealed that unconscious emotional responses are elicited by food stimuli. The data, together with other evidence, suggest that unconscious emotional responses can be triggered by various types of stimuli, including emotional expressions and food. Moreover, the results demonstrated that the unconscious emotional responses to food are positively associated with the tendency for external eating. This suggests that unconscious emotional reactions play a key role in behaviors in daily life.

#### *fMRI Study of the Neural Mechanisms for Unconscious Emotional Processing of Food*

Next, we explored the neural mechanisms for unconscious emotional processing using visual food stimuli. Several prior fMRI studies have investigated neural activity in response to supraliminally presented food images. These studies consistently reported that some brain regions, including the neocortical visual areas (e.g., the fusiform gyrus) and limbic regions (e.g., the amygdala), are activated more strongly in response to food images than non-food images (e.g., Holsen et al., 2005; for a review, see van Meer et al., 2015). Accordingly, some scholars proposed that the neocortical visual areas are involved in the visual recognition of food images, which, in turn, activates the amygdala and other related regions for emotional processing (Chen et al., 2016).

However, the neural mechanisms underpinning the unconscious emotional responses to food remained unknown. In related literature, several previous neuroimaging studies have reported that the unconscious processing of emotional expressions activates the amygdala (e.g., Morris et al., 1998). A few neuropsychological studies also found an indispensable role of the amygdala in the unconscious processing of emotional scenes (e.g., Kubota et al., 2000). Based on these findings, we hypothesized that the amygdala could be activated during both conscious and unconscious emotional processing of food.

Furthermore, prior neuroimaging studies investigating facial expression processing have found that neural pathways are different between conscious and unconscious emotional processing. Some studies provided evidence, albeit correlational rather than causal, that emotional information in facial expressions is transmitted unconsciously to the amygdala through the subcortical pathway, which includes the superior colliculus and pulvinar (e.g., Morris et al., 1999). It was also reported that the visual pathways involved in conscious and unconscious processing of emotional facial expressions differ (e.g., Vuilleumier et al., 2001). On the basis of such evidence, we hypothesized that the visual pathways to the amygdala for conscious and unconscious processing of food would differ and that subcortical structures would be involved in unconscious food processing.

In this study (Sato et al., 2019), we tested these hypotheses by measuring fMRI while participants viewed supraliminally or subliminally presented food images. We examined the commonalities and differences in neural responses to food versus mosaic images across presentation conditions. Furthermore, we conducted dynamic causal modeling and compared models with the subcortical, cortical, and dual visual pathways to the amygdala. We tested 22 healthy participants, all of whom had fasted for more than 3 h before the experiment. The stimuli presented were identical to those used in the above psychological experiment (Sato et al., 2016). Color photographs of fast food and Japanese diet and their corresponding randomized mosaic images were used. The participants completed two runs of 128 trials using a block design. Each run included one of the presentation conditions, and the order was fixed, with the subliminal condition first and the supraliminal condition second. In each trial, a food or mosaic image was displayed in the center after a fixation cross. Under the subliminal condition, the stimulus was displayed for 17 ms and immediately replaced by a mask for 1483 ms. Under the supraliminal condition, the stimulus was displayed for 1500 ms without mask presentation. In eight trials pseudo-randomly placed throughout the task blocks, a red cross was displayed for 1500 ms as the target, instead of the food or mosaic images. Participants were instructed to perform a dummy task to detect the red cross.
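Presentation durations such as these are realized on a display as whole numbers of refresh frames. The following sketch converts the durations reported above into frame counts; the 60 Hz refresh rate and the helper name `frames_for` are assumptions made for illustration, since the chapter does not state the display settings.

```python
def frames_for(duration_ms, refresh_hz=60.0):
    """Nearest whole number of refresh frames and the duration they produce (ms)."""
    frame_ms = 1000.0 / refresh_hz  # assumed refresh rate, not given in the chapter
    n_frames = max(1, round(duration_ms / frame_ms))
    return n_frames, n_frames * frame_ms

# Durations reported for the fMRI experiment.
for name, ms in [("subliminal stimulus", 17), ("mask", 1483), ("supraliminal stimulus", 1500)]:
    n_frames, actual_ms = frames_for(ms)
    print(f"{name}: {n_frames} frame(s), about {actual_ms:.1f} ms")
```

Under this assumption, the subliminal stimulus and its mask together occupy the same number of frames as the supraliminal stimulus, so both conditions have equal trial lengths.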

We performed a conjunction analysis to determine commonalities in neural responses to food versus mosaic images across presentation conditions. The results showed significantly stronger activation in the bilateral amygdala in response to food than mosaic images under both the subliminal and supraliminal conditions (Fig. 2.3). We conducted the interaction contrast between stimulus type and presentation condition to analyze differences in neural responses to food versus mosaic images across presentation conditions. The results showed significantly stronger activation for food versus mosaic images under the supraliminal than subliminal condition in the broad bilateral posterior regions, including the fusiform gyrus (Fig. 2.3).

**Fig. 2.3** Sato et al.'s (2019) study. (Upper left) Statistical parametric maps showing significant neural activation in response to food versus mosaic images under both the subliminal and supraliminal conditions and mean (± *SE*) effect size differences between the food and mosaic conditions. The blue cross indicates the activation focus at the right amygdala. (Upper right) Statistical parametric maps showing significantly stronger neural responses to food versus mosaic images under the supraliminal than subliminal condition and mean (± *SE*) effect size differences between the food and mosaic conditions. The blue cross indicates the activation focus at the right fusiform gyrus. (Lower) Models (left) and model comparison results (right) of dynamic causal modeling. The solid and dashed arrows indicate modulatory connections in the subcortical and cortical pathway models, respectively. The dual pathways model contains both pathways. The model comparison results in the right hemisphere are shown. *AMY* amygdala, *FG* fusiform gyrus, *PUL* pulvinar, *V1* primary visual cortex

We performed dynamic causal modeling to determine the visual pathway to the amygdala in each hemisphere. We compared the models in which the subcortical (pulvinar–amygdala), cortical (primary visual cortex–fusiform gyrus–amygdala), and dual visual pathways were functionally coupled with the amygdala specifically during food processing (Fig. 2.3). In both hemispheres, the model comparison indicated that the subcortical pathway model was the most likely under the subliminal condition, while the dual pathways model was optimal under the supraliminal condition (Fig. 2.3).
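The logic of comparing candidate models can be illustrated with a simplified calculation: given a log evidence for each model and equal prior probabilities, the posterior probability of each model follows from a normalized exponential. The chapter does not state which comparison procedure was used, so this is only a generic sketch, and the log-evidence values below are hypothetical numbers chosen for illustration.

```python
import numpy as np

def posterior_model_probs(log_evidence):
    """Posterior probability of each model, assuming equal prior probabilities."""
    log_evidence = np.asarray(log_evidence, dtype=float)
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    weights = np.exp(log_evidence - log_evidence.max())
    return weights / weights.sum()

models = ["subcortical", "cortical", "dual"]
log_evidence = [-1000.0, -1006.0, -1003.0]  # hypothetical values, not from the study
probs = posterior_model_probs(log_evidence)
best = models[int(np.argmax(probs))]
```

Only differences in log evidence matter here: adding a constant to all three values leaves the posterior probabilities unchanged.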

Our results demonstrated that the amygdala is active in response to food images in both the subliminal and supraliminal conditions. These results imply that the amygdala is commonly involved in the unconscious and conscious emotional processing of food. These results are consistent with, and extend, the substantial neuroimaging and neuropsychological evidence indicating the involvement of the amygdala in the processing of stimuli with emotional significance (e.g., Sato et al., 2004). The visual areas were activated in response to supraliminally versus subliminally presented food. The neocortical visual areas may be related to the conscious perception of food.

Furthermore, our dynamic causal modeling analyses provide causal evidence that the amygdala is activated by visual food stimuli through the subcortical visual pathway before the conscious recognition of food occurs. Subsequently, the amygdala receives the processed visual information of food through the neocortical pathway. In addition to the aforementioned anatomical (Day-Brown et al., 2010; Tamietto et al., 2012) and neuroimaging (Morris et al., 1999) findings, our model of the subcortical visual input to the amygdala under the subliminal condition is consistent with data showing that a patient with damage in the neocortical visual areas showed amygdala activity, which was functionally coupled with pulvinar activity, in response to unseen emotional expressions (Morris et al., 2001).

#### *Intracranial EEG Study of the Neural Processing of Emotional Expressions*

Here, we tried to demonstrate rapid amygdala activation during emotional processing using facial expression stimuli. As described above, a number of neuroimaging studies have shown that the amygdala is active during the visual processing of emotional stimuli, such as emotional facial expressions and palatable food, even in the absence of conscious awareness of the stimuli (e.g., Morris et al., 1999). Some researchers proposed that the amygdala may be activated during an early stage of emotional processing because the amygdala receives sensory input from the subcortical pathway.

However, the temporal profile of the amygdala activation, specifically during the processing of emotional facial expressions, remained unclear. Some studies have examined this issue by recording magnetoencephalography while participants observed emotional facial expressions and found that stronger amygdala responses to fearful/threatening than neutral expressions occurred rapidly, approximately 100 ms after the stimulus onset (e.g., Luo et al., 2007). However, the results were inconsistent and there remains debate over whether the activity of such a deep and complex brain structure as the amygdala can be appropriately estimated from scalp-recorded electromagnetic signals (Papadelis et al., 2009).

Intracranial EEG recordings can offer direct evidence of electric neural activity with high temporal resolution. In this regard, a previous study examined amygdala activity while participants viewed negative, positive, and neutral scenes by employing intracranial EEG recordings and time–frequency analyses (Oya et al., 2002). The results showed stronger gamma-band (around 40 Hz) oscillations in the amygdala in response to negative scenes, as compared with both positive and neutral scenes, as early as 50–150 ms after stimulus onset. Based on these data, we hypothesized that the amygdala could reveal similar rapid gamma-band oscillations while viewing other emotional stimuli, i.e., fearful facial expressions. In this study (Sato et al., 2011), to test this hypothesis, we recorded the intracranial EEG from the amygdala while participants observed fearful, happy, and neutral facial expressions.

We tested six patients. All patients suffered from pharmacologically resistant epilepsy and their intracranial EEG was recorded in a presurgical evaluation. Surgical and electrophysiological assessments suggested that the main epileptic foci were outside the amygdala. Pre- and post-implantation anatomical assessments showed no structural abnormalities in any patient's bilateral amygdala. Implantation of intracranial electrodes was performed according to a stereotactic method (Mihara & Baba, 2001). Post-implantation anatomical MRI assessments ensured that the target electrodes were located in the amygdala (Fig. 2.4). The stimuli consisted of grayscale photographs of seven individuals' faces depicting fearful, happy, and neutral expressions. In each trial, after a fixation cross, the stimulus was displayed for 1000 ms in the center of the screen. The response panel was then displayed and the participants performed a dummy task to specify the gender of the displayed faces.

Time–frequency statistical parametric mapping analyses for the comparison between fearful and neutral expressions revealed significant gamma-band activity between 50 and 150 ms (starting before 100 ms; Fig. 2.4).
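To make the idea of a time–frequency analysis concrete, the sketch below estimates power at a single gamma-band frequency by convolving a signal with a complex Morlet wavelet. It is a generic illustration on a synthetic signal, not the pipeline used in the study: the sampling rate, the 40 Hz target frequency, the number of wavelet cycles, and the simulated burst between 50 and 150 ms are all assumptions chosen for the example.

```python
import numpy as np

def morlet_power(signal, fs, freq, n_cycles=7.0):
    """Power over time at one frequency via complex Morlet wavelet convolution."""
    sigma_t = n_cycles / (2.0 * np.pi * freq)  # SD of the Gaussian envelope (s)
    half = int(round(4 * sigma_t * fs))         # wavelet spans +/- 4 SD
    t = np.arange(-half, half + 1) / fs
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t ** 2 / (2 * sigma_t ** 2))
    wavelet /= np.abs(wavelet).sum()            # normalise the envelope to unit sum
    return np.abs(np.convolve(signal, wavelet, mode="same")) ** 2

# Synthetic recording: weak noise plus a 40 Hz burst from 50 to 150 ms after onset.
fs = 1000.0
times = np.arange(-0.5, 1.0, 1.0 / fs)
rng = np.random.default_rng(1)
signal = 0.1 * rng.standard_normal(times.size)
burst = (times >= 0.05) & (times <= 0.15)
signal[burst] += np.sin(2 * np.pi * 40.0 * times[burst])

power = morlet_power(signal, fs, freq=40.0)
baseline = power[(times >= -0.4) & (times < -0.1)].mean()
gamma_window = power[burst].mean()
```

Comparing `gamma_window` with `baseline` shows the burst as a clear increase in 40 Hz power; a full analysis would repeat this across frequencies, trials, and conditions before any statistical mapping.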

What are the implications of rapid amygdala activity triggered before 100 ms for our understanding of emotional processing? Numerous previous scalp- (e.g., Bentin et al., 1996) and subdurally-recorded (e.g., Sato et al., 2014a) EEG studies have reported that the first face-specific visual analysis in the neocortical visual areas occurs after 100 ms. Together with such findings, our data imply that the emotional processing of facial expressions in the amygdala is faster than the first visual analysis of faces in the neocortex. Furthermore, another line of scalp-recorded EEG research that investigated conscious awareness of visual stimuli has shown that the negative deflection at the posterior cortices from 200 to 400 ms was greater in response to seen than unseen stimuli (e.g., Genetti et al., 2009; for a review, see Koivisto & Revonsuo, 2010). Together with these findings, our results suggest that amygdala activity at about 100 ms reflects the emotional processing that takes place prior to the conscious perception of the stimuli.

**Fig. 2.4** Sato et al.'s (2011) study. (Upper) Representative anatomical magnetic resonance image. The red cross indicates the location of the amygdala electrode. (Lower) Statistical parametric maps for amygdala gamma-band activation for fearful compared with neutral facial expressions (left) and mean (± *SE*) effect size at the peak activation focus (right)

#### **Conclusion**

In summary, our psychological data demonstrate that humans have psychological mechanisms for unconscious emotional processing. The findings presented in the section "Psychological Study of the Unconscious Emotional Processing of Food" suggest that such unconscious emotional responses are general and play important roles in daily life. The fMRI data presented reveal that the amygdala is involved in emotional processing via the subcortical pathway prior to conscious awareness of the stimuli. The intracranial EEG data described demonstrate that the amygdala is rapidly activated in response to emotional stimuli, specifically at approximately 100 ms after stimulus onset. Taken together, these findings suggest that the amygdala implements rapid and unconscious emotional processing via the subcortical pathways at approximately 100 ms.

These findings have implications for human behavior. First, the model suggests that rapidly evaluating the emotional significance of stimuli (via the subcortical visual pathway and unconscious and rapid amygdala activity) is mandatory, and difficult to consciously prevent. Therefore, people should acknowledge such psychological mechanisms and take precautions or slowly adjust their behaviors to mitigate rapid emotional responses. For example, when someone wants to control their eating behaviors, they should not visit food-abundant environments, such as supermarkets and convenience stores. Second, the model suggests that subjective emotional states could provide valuable information about the rapid and unconscious evaluative processes that take place in the amygdala. For example, if one experiences slightly negative or positive feelings during social interaction, this subjective information may indicate that one's amygdala has automatically detected subtle biologically or socially significant messages.

**Acknowledgments** The author would like to thank Dr. Paulo Sérgio Boggio and Dr. Tanja S. H. Wingenbach for their advice. This study was supported by funds from the Research Complex Program of the Japan Science and Technology Agency.

#### **References**



## **Chapter 3 Social and Affective Neuroscience of Embodiment**

**Marília Lira da Silveira Coêlho, Tanja S. H. Wingenbach, and Paulo Sérgio Boggio**

**Abstract** Embodiment has been discussed in the context of social, affective, and cognitive psychology, and also in neuroscientific investigations, in order to understand the relationship between biological mechanisms, the body, and cognitive, social, and affective processes. New theoretical models have been presented by researchers considering not only the sensory–motor interaction and the environment but also biological mechanisms regulating homeostasis and neural processes (Tsakiris M, Q J Exp Psychol 70(4):597–609, 2017). Historically, the body and the mind were comprehended as separate entities. The body was considered to function as a machine, responsible for providing sensory information to the mind and executing its commands. The mind, however, would process information in an isolated way, similar to a computer (Pecher D, Zwaan RA, Grounding cognition: the role of perception and action in memory, language, and thinking. Cambridge University Press, 2005). This mind and body perspective (Marmeleira J, Duarte Santos G, Percept Motor Skills 126, 2019; Marshall PJ, Child Dev Perspect 10(4):245–250, 2016), for many years, was the basis for studies in social and cognitive areas, in neuroscience, and clinical psychology.

**Keywords** Embodiment · Empathy · Racial bias · Social embodiment · Emotion embodiment

M. L. da Silveira Coêlho (\*) · P. S. Boggio

Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil e-mail: marilialira.coelho@mackenzie.br

T. S. H. Wingenbach

School of Human Sciences, Faculty of Education, Health, and Human Sciences, University of Greenwich, Greenwich, London, UK

#### **Introduction**

Embodiment has been discussed in the context of social, affective, and cognitive psychology, and also in the investigations of neuroscience in order to understand the relationship between biological mechanisms, body and cognitive, and social and affective processes. New theoretical models have been presented by researchers considering not only the sensory–motor interaction and the environment but also biological mechanisms regulating homeostasis and neural processes (Tsakiris, 2017).

Historically, the body and the mind were comprehended as separate entities. The body was considered to function as a machine, responsible for providing sensory information to the mind and executing its commands. The mind, however, would process information in an isolated way, similar to a computer (Pecher & Zwaan, 2005). This mind and body perspective (Marmeleira & Duarte Santos, 2019; Marshall, 2016), for many years, was the basis for studies in social and cognitive areas, in neuroscience, and clinical psychology.

However, the dichotomous discussion of mind and body has been replaced by an approach that considers the individual's integrality. Embodiment, in turn, arises from the connection between body, emotions, brain, and environment (Marshall, 2016). Thus, the body is no longer seen as a simple sensory–motor interface, neither is the mind seen as a set of logical functions and isolated cognitive abilities. Together, the body and mind become an integral biological system modulated by experiences provided by homeostatic self-regulation interconnected with interactions with other individuals and with the environment (Marmeleira & Duarte Santos, 2019). In this perspective, embodiment is understood as a representation of the self and its interaction with the world. In this chapter, we are discussing embodiment in both social and affective processes.

#### **Neuroscience of Embodiment**

Embodiment is experienced through representations in the brain based on simulations of predictions and patterns constructed by our experiences both at the perceptual and motor level (Barrett, 2017; Longo & Tsakiris, 2013). Our perceptual experience occurs through sensory inputs, such as auditory, visual, or vestibular sensations, and also through somatic experiences, such as touch, pain, vibration, and the position of the body itself. To exemplify, let us consider the action of grasping a pen with the fingers. The tactile sensation when touching the pen is temporally and spatially congruent with seeing the fingers grasping the pen. Incoming visual information about the location of the body (i.e., fingers grasping the pen) is processed by the visual cortex and is related to a somatic representation of the perception of the visual space around the body parts (Holmes & Spence, 2004; Kilteni et al., 2015). The execution of the motor action (here: grasping a pen) includes efferent motor signals and the associated touch sensation includes afferent feedback. The synchronous integration of the visuo-tactile and proprioceptive signals promotes the experience of the moving body parts being perceived as one's own (Longo & Tsakiris, 2013; Tsakiris, 2010). These integration processes allow one to differentiate between one's own perceptual experiences and those of others but also serve as the basis for experiences being grounded in one's body (hence, embodiment).

The posterior parietal cortex (PPC) and ventral premotor cortex (PMv) play a fundamental role in the perception of the body and the surrounding space (Holmes & Spence, 2004). Visual–somatosensory coordination includes encoding the position of the body in space and comparing the felt with the seen position. Multisensory neurons respond to tactile and proprioceptive stimulation (e.g., touch sensation when grasping a pen and knowledge of the hand's location in space), but also to visual stimulation (seeing the hand moving and the fingers grasping the pen) (Graziano, 1999; Zopf et al., 2010). The representation of an action can be used in simulations to predict sensations and to track mismatches between sensory predictions and real perception of the sensory environment (Barrett, 2017). The continuous coupling of visuo-tactile and proprioceptive signals can explain the strong neural connections between the visual, motor, and somatosensory cortex.

These processes can be facilitated by specific neurons that fire both during action observation and action execution. Early monkey studies showed that some neurons (in the premotor brain area F5) fire during action observation as well as action execution (Gallese, 2007; Rizzolatti et al., 1996), which serves as a potential explanation for simulation processes and understanding others' actions. These neurons are now called mirror neurons. In the study conducted by Rizzolatti et al. (1996), it was discovered that some neurons fired when the monkey saw a grasping action and that these were the same neurons that fired when the monkey was performing a grasping action. Another experiment included a second monkey and a human experimenter, and a similar response of this group of neurons was found (Rizzolatti et al., 1996). These results demonstrate the activation of the mirror neuron system when observing movement-related action. Mirror neurons were found to be somatotopically arranged in the premotor cortex and reciprocally connected in the posterior parietal cortex; these areas are considered analogous to the areas containing mirror neurons in monkeys (Rizzolatti et al., 1996).

The experiments in monkeys revealed that, in addition to being active during observation and execution of an action, the F5 area is also active during partially hidden observation, when it is possible to predict the result of the action even in the absence of complete visual information about its execution and the interaction with the target object. Umiltà et al. (2001) conducted a study with monkeys with two experimental conditions: the "total" vision condition, in which the monkey was shown a fully visible action directed at an object (hand–object interaction), and the "partial" vision condition, in which the same action was shown but the final part of the action was hidden. The results showed that there was activation of mirror neurons in the F5 area in both experimental conditions (Umiltà et al., 2001). This supports the suggestion that the understanding of an action can be based on predictions from the internal motor representation of the action, through the anticipation of the final objective of the action performed by others. This mechanism can therefore be understood as a precursor of more sophisticated skills of understanding the intention of others (Gallese, 2007).

Gallese (2007) calls the mechanism by which mirror neurons help us understand others "embodied simulation." The embodied simulation theory by Gallese (2007) proposes that the mirror neuron system may be involved in processes of social cognition, such as understanding others' actions and intentions, attributing mental states to others, and language. Other studies suggest that the mirror neuron system is involved in social cognition processes, such as facial expression recognition and ultimately empathy (Mier et al., 2010; Schulte-Rüther et al., 2007); the mirror neuron system is thought to include the fusiform gyrus, superior temporal sulcus, posterior parietal cortex, ventral premotor cortex, and amygdala (Schmidt et al., 2021).

Overall, such evidence suggests that there is an embodied nature to actions and cognitive processes. This embodiment makes it possible to run simulations to guide action and to use such internal models to give meaning and coherence to sensations (Barrett, 2017). Thus, brain simulations function as filters for sensory stimulus inputs, driving action and constructing perception of both cognition and emotions (see Barrett, 2017, for a review). Conversely, the manipulation of multisensory stimuli can modulate representations of the body and create perceptual illusions of body parts and embodiment illusions of the self and self-other (see next section). Having touched upon the neuroscience of embodiment, this chapter continues by delving into social and affective processes that can be explained by embodiment.

#### **Embodiment and Social Embodiment**

Embodiment is centered on our subjective experiences grounded in our physical body (Gillihan & Farah, 2005). It is through this bodily self-awareness that we understand that we have a body, that we feel it as our own, that it occupies a place in space, and that there is a space around it. The formation of this bodily self-awareness depends on the integration of bodily signals from different sensory modalities, which signal the location of body parts and of the entire body in space, as well as providing information that we are within this body. Therefore, this body assumes the perspective of the "self" in experimentation and interaction with the world (Blanke et al., 2015; Mul et al., 2019).

Embodiment from the internal body representation perspective can be expressed through the sensations of body ownership and of motor agency. The sensation of agency precedes a motor action, and it involves the efferent component because centrally generated motor commands precede a voluntary movement (Tsakiris et al., 2006). It is the intention and execution of actions that allow the sensation of movement control of the body in a given task (Gallagher, 2000; Tsakiris et al., 2006). Body ownership is related to the sensation of the presence of the body itself. According to Gallagher (2000), it is the feeling that "my body" belongs to me, and it is always present in one's mental life. This feeling of embodiment is present during motor actions in performing a task, as well as during passive bodily experiences such as being touched (Tsakiris et al., 2006). The body scheme's neural construction is formed throughout life: a dynamic update based on sensory cues experienced by the body and its interaction with the environment (Cardinali et al., 2009). Hence, we learn cognitive and motor skills and the perception of our own body based on these sensory experiences. Embodiment is modulated by bodily experiences, but also by affective experiences and internal body representation (Braun et al., 2018; Marmeleira & Duarte Santos, 2019).

Therefore, the sense of body ownership should be considered the result of both external sensory stimuli, which integrate different sensory signals (somatosensory, vestibular, visual) in the formation of body perception (Botvinick & Cohen, 1998; Kilteni et al., 2015; Tsakiris, 2010), and internal, interoceptive stimuli, which form the internal body representation. This multisensory information interacts with motor systems in motor action, making it possible for the body scheme to locate and perceive a body part's position in space (Margolis & Longo, 2014; Medina & Coslett, 2010), contributing to the implementation of actions involved in the interaction with the environment (Assaiante et al., 2014).

The plasticity of multisensory integration, through simultaneous sensory stimuli of spatial and temporal congruence, has been extensively studied, showing that bodily representations and peripersonal space can be modulated after seconds of sensory manipulation, incorporation of instruments, mirror images, and use of inanimate objects such as a rubber hand. Synchronous visuo-tactile or visuomotor interactions make it possible to change one's perception of peripersonal and body space, which can modify the body scheme and induce the sensation of body ownership, including ownership of someone else's body part, as in the rubber hand illusion (Botvinick & Cohen, 1998; Holmes & Spence, 2004; Kilteni et al., 2015). In this illusion, the participant's hand is occluded from their vision and replaced by a prosthesis with similar characteristics, positioned close to the body and aligned with the shoulder. In order for the illusion to occur, both the real and the fake hands must be touched synchronously in time and precisely in the same location. This visuo-tactile–proprioceptive interaction generates a conflict between what is seen on the prosthesis and what is felt on the hand, and it promotes incorporation of the rubber hand by the body scheme and the sensation of body ownership (Botvinick & Cohen, 1998). Thus, the illusions that manipulate the sense of body ownership are potentially experimental tools for investigating body representation and peripersonal space (Costantini & Haggard, 2007).

In this context, it is possible to suggest that self-awareness is highly malleable and influenced by external sensory information, as evidenced by several studies. However, in addition to external sensory information, we have internal representations formed by interoceptors that allow us to have consciousness of our body (Tsakiris, 2017). In a review, Craig (2009) proposed that interoceptive representations contained in the insular cortex provide a basis for the subjective feelings of body and consciousness. The insula is the interoceptive center in the brain, and it plays a fundamental role in the representation of self-awareness involving the integration of external stimuli arising from the environment and the feeling of agency and control of one's own body. The insula is also linked to the affective processing of the self and the other and to processes of social cognition, such as empathy, representation of oneself, and sense of identity (Craig, 2009; Tsakiris, 2017). Thus, interoception plays a fundamental role in self-awareness and in the stability of the internal representation that, despite the influences of exteroceptive signals and social interaction with others, maintains the representation of the body's self-awareness as being "mine" (Tsakiris, 2017).

From this perspective, social neuroscience began studying embodiment in order to have a better understanding of social perception, attitudes, and emotion of the self and the others (Niedenthal & Barsalou, 2005). Studies have shown that embodiment can be influenced by social experiences and by the processing of perceived social information, which makes us susceptible to experiencing an overlap between our own body representation and that of the other (Tsakiris, 2017). Sforza et al. (2010) demonstrated that when participants' faces were touched in synchrony with touches they saw on a partner's face (the "enfacement" illusion), the partner's facial characteristics were incorporated into the representation of the participant's own face; the same did not happen during asynchronous touch. Similar results were found in the study carried out by Tajadura-Jiménez and Tsakiris (2014); in addition, the authors showed that individual interoceptive sensitivity modulates the effect of synchronous multisensory (exteroceptive) stimulation on self-recognition. These findings suggest that the sense of body ownership is malleable through multisensory integration, and that it is possible to induce a sense of ownership over a part of another person's body as if it were one's own, yet the recognition of one's own body as distinct from others is weighted by individual interoceptive sensitivity.

From studies on the embodiment of the self and the other, it is possible to demonstrate how perceptual illusions can modulate multisensory integration, but also the social perception of the other. In the study conducted by Paladino et al. (2010), the sensation of being touched synchronously with the observed touch of another person provoked more positive affective reactions than in the asynchronous condition. In addition, participants felt closer to the other person and the perception of face resemblance was increased. Other studies were conducted in order to understand whether the modulation of social perception in the embodiment of the other can influence racial bias. Peck et al. (2013), through virtual reality, investigated whether the embodiment of light-skinned people in virtual bodies with dark skin, light skin, or purple skin, or without a virtual body, modulated implicit racial bias. The results revealed that implicit racial bias decreased when the dark-skinned virtual body was embodied. Farmer et al. (2014) used the rubber hand illusion with black and white hands in Caucasian participants. Synchronous stimulation of the dark-skinned rubber hand induced a sensation of body ownership and was associated with a more positive implicit attitude toward black people. However, the authors observed that the most favorable results of the rubber hand illusion were found in participants with low implicit racial bias against dark-skinned people. Similarly, Lira et al. (2017) revealed that higher implicit racial bias against dark-skinned people affected the temporal dynamics of multisensory integration during the rubber hand illusion and delayed the attribution of body ownership to the hand of another racial group in Caucasian participants. These results together show that social embodiment and recognition of the self and the other are influenced by the way we are connected to the other, which involves cultural, emotional, and affective aspects.

Finally, perceptive illusions have been shown to be an important tool to manipulate the embodiment of the body itself and the body of the other. Interestingly, the embodiment of the other has helped to understand social processes such as empathy, racial bias, change of the negative valence for the judgment of the other, social perception, among other aspects. The studies have shown us the malleability and the rapid adaptability to the judgment of the implicit social attitude when we experience the body of the other despite the existing cultural differences. Perhaps the advancement of studies of social embodiment allows us to better understand categorization, prejudice, and discrimination from the embodiment of the other and its neural and physiological correlates.

#### **Embodiment of Emotion**

Embodied cognition accounts postulate that there are interrelations between the body (e.g., body posture, gestures) and cognition, and it is assumed that emotions are also embodied. Darwin (1872) observed that physical bodily actions are closely related to an emotional experience and that an experienced emotion seems to result in a particular behavioral pattern. The assumption of embodiment is that we acquire memory and thus knowledge on concrete objects or abstract concepts (e.g., emotions) through experience and store all information of the specific experience (i.e., context, affect, behavior, etc.) together in a representation (Barsalou, 2008). Sensory experiences from all modalities (motor, sensory, and affective) are stored in these representations. When knowledge is required of a concrete object or abstract concept, the memory stored in its representation can get activated and a simulation of the initial state when the knowledge was acquired takes place in sensory–motor brain areas and can initiate responses across the body, although this can be a partial re-enactment of lesser intensity (Barsalou, 2008; Niedenthal, 2007). Using functional magnetic resonance imaging, Wicker et al. (2003) showed that the same brain region (i.e., insula) is activated in participants when they are seeing a facial expression of disgust as when they are experiencing disgust themselves, demonstrating that the same neural network is involved in the representation as in the experience of this emotion. It is very likely that a triggered representation of an emotion presents itself beyond the neural activation and that changes occur across the body.

Representations are indeed not solely localized in the brain but encompass the whole body. Nummenmaa et al. (2014) conducted multiple experiments on the representation of emotions across the body. In one experiment, participants were asked to localize specific emotions in the body (by coloring in body maps) where the emotion would be felt. In another experiment, emotions were elicited in participants, and they were asked to report the accompanying bodily sensations. In yet another experiment, participants were asked to link observed facial emotion to parts of the body where the emotion would be felt by the person displaying the emotion facially. The results showed that bodily sensations are linked to discrete emotions, reflecting the representation of emotion concepts across the body. For example, the emotion of sadness was portrayed as a reduction in bodily sensations in the limbs, in line with the lowered muscle tone and drive in activity experienced during sadness. In a further study, Nummenmaa et al. (2018) showed there are neural activation patterns associated with emotional states and demonstrated again that emotions are embodied.

The various aspects stored in a representation of an emotion, i.e., body postures, facial expressions, physiological responses (e.g., pulse), can each trigger the other parts of the representation. As such, verbally reporting about a joyful experience and thereby accessing the conceptual knowledge on joy is likely to activate the representation of joy, leading to the experience of positive affect (that was felt when the situation initially occurred), an associated facial expression of smiling and other physical components. This occurrence has been demonstrated experimentally. Providing participants with one-sentence descriptions of emotional situations and prompting them to imagine the scenario leads to respective subjective feelings and facial muscle activation associated with the emotion imagined (Brown & Schwartz, 1980). Likewise, research has shown that the mere production of a facial expression associated with a specific emotion can activate the representation of this emotion and lead to subjective experience of said emotion. Hess et al. (1992) asked participants to either feel (to generate the feeling but to keep it inside and not show it) the emotions anger, sadness, happiness, and peacefulness, or to merely express these emotions, or to express and feel the emotions. Self-report ratings of felt emotions were obtained and showed that even the experimental condition of mere production of facial emotional expression led to emotion experience, despite the instruction to not feel and only express the emotion. A recent study further demonstrated that emotions are represented across the body. Participants observed facial expressions of fear and anger while electromyography was recorded from muscles in the face and arm each associated with expressions of fear and anger, and the results showed congruent muscle activity in face and arm for the emotions investigated (Moody et al., 2017). Such results demonstrate that individual aspects of conceptual knowledge can activate other parts of the emotion representation including changes across the body.

The literature presented in this section thus far has included an explicit emotional stimulus which activated emotion representations. However, activation of emotion representations also takes effect across the body when people are unaware of the activated emotion representation, that is, without an explicit emotional stimulus. In one study, participants believed brain lateralization was measured using electroencephalography while they listened to music and were told they had to relax/contract facial muscles as a conflicting task (Duclos et al., 1989). However, the facial muscle activation manipulations actually resulted in facial expressions associated with individual emotional expressions, and no brain activity was measured. Self-ratings on emotional experience were obtained but disguised as a necessity to control for interference with the obtained electroencephalography recordings. Results showed that facial expression manipulations associated with anger, disgust, fear, and sadness resulted in higher emotional experience reports for each of these emotions. In a second experiment, the body posture of participants was manipulated to represent fear, anger, and sadness and resulted in the respective emotional experiences. Since emotion representations can be triggered without us perceiving an emotional stimulus, it might be the case that elicited emotion representations have further effects, in that bodily states might also affect our behavior and cognitions related to an emotion experience, and this without our awareness.

Embodied aspects of emotions indeed affect cognitions that are related to a current bodily state even in the case that the relationship between the bodily action and the emotion is unknown to a person. Probably the most famous study on the effects of bodily action on cognition was conducted by Strack et al. (1988), who manipulated participants' mouth position and examined the effects on evaluations of cartoons regarding their funniness. When participants were holding a pen with their teeth, sticking out of their mouth, and so unconsciously simulating a smile, they rated cartoons as funnier than participants holding a pen with their lips in a way that smiling was prevented. The experimentally induced smile was not an expression of truly felt positive affect but elicited the respective representation and could thus influence the evaluations of the cartoons. In both experimental conditions included in the study by Strack et al. (1988), muscular feedback from the face influenced the evaluations of the cartoons. One explanation is that the experimentally induced smile was perceived by participants as resulting from the cartoons and interpreted as being amused, which is a more cognitive explanation. An alternative explanation, rooted in the body, is that the experimentally induced smile created muscular feedback which elicited the respective emotion representation and thereby altered evaluations accordingly. This study constitutes one example of how the body can influence cognitions without the person being aware of this influence. However, it should be noted that a multicentre replication study did not consistently reproduce the same results (Wagenmakers et al., 2016). Nonetheless, a preceding study also demonstrated that manipulations of facial expression toward frowning and smiling without participants' awareness affected participants' emotional experience as well as funniness evaluations of cartoons (Laird, 1974). Such findings further align with aforementioned literature in this chapter on bodily state manipulations related to emotions and respective emotional experiences. It is clear that physical changes occur within the body during emotional experience, but it has also been demonstrated that these changes serve a purpose, in that they prepare for subsequent action, e.g., increased blood flow to skeletal muscles during fear to prepare for flight (e.g., Balters & Steinert, 2017; de Gelder et al., 2004). Consequently, it is no long stretch to assume that bodily states would also affect cognitions, which is the fundamental proposition of grounded cognition or embodiment theories (Barsalou, 2008), and many research findings support this assumption (Winkielman et al., 2015).

A further example of how embodiment of emotions can affect cognitions is provided by a study on memory. Participants enacted body postures associated with specific emotions (but were unaware of this emotion-related manipulation), which facilitated recalling of personal experiences containing these emotions (Schnall & Laird, 2003). In this case, the facilitation of the performance resulted from the congruency between the triggered emotion representation and the emotion in the task. Hence, incongruence between bodily state and emotion stimulus should lead to hampered performance. That bodily states incongruent with an observed stimulus in a task can affect cognitions was demonstrated in a study where facial muscle activations were manipulated and their effect on facial emotion recognition investigated. When participants were holding a pen in their mouth in a way that antagonist muscles to the observed facial emotional expressions were activated, this induced facial muscle feedback that was incongruent with the muscle activation underlying the observed facial expression and lowered recognition rates compared to a control condition without mouth movement manipulation (Wingenbach et al., 2018). These results can be explained with embodiment of emotion. Given that the observation of a specific facial emotional expression should elicit its representation (including the respective facial muscle activity), motor information incongruent with the visual information, i.e., observed emotion, should cause interference with the elicited representation, and thus hamper recognition. Similarly, an electroencephalography study demonstrated that interfering with the simulation of observed facial emotion through facial muscle activation manipulation impairs processing of the observed facial emotion as evidenced by greater semantic retrieval demand, i.e., larger N400 amplitude (Davis et al., 2017). In a similar fashion, botulinum toxin A injections in the corrugator muscle of the face (associated with frowning) impaired comprehension of language with sad and angry emotional content (both emotional expressions include corrugator activation), as measured by reading time compared to pre-injection (Havas et al., 2010). These results exemplify the effect bodily states can have on cognitions and highlight the all-encompassing nature of emotion representations across the body.

The relationship between bodily states and cognitions within the framework of embodiment of emotion is bidirectional. That is, cognitive processes related to emotion can have an effect on our bodily states just as bodily states can affect cognitions. For example, participants' posture was measured in vertical height during generation of terms associated with pride and disappointment, and a significant decrease in height was found during the disappointment condition compared to the pride condition (Oosterwijk et al., 2007). The experience of pride is generally accompanied by a straightened body posture, whereas disappointment usually results in a slumping position, and the conceptual understanding of these terms was reflected in participants' posture. Further evidence for the effect of cognition on bodily states based on embodiment of emotion comes from a study where participants had to pull or push a lever while seeing positive and negative stimuli and were found to push faster for negatively valenced stimuli than they pulled, and vice versa for positively valenced stimuli (Chen & Bargh, 1999). The results can be explained by an evaluation of a stimulus as positive or negative that is embodied in the bodily behavior by facilitating approach for positive stimuli and avoidance for negative stimuli. Similar results were obtained from a study where participants were faster at pulling a slider when the content of a read sentence was positive compared to negative and pushed faster for negative compared to positive content (Filik et al., 2015). The literature demonstrates that conceptual knowledge on emotion-related stimuli is reflected in bodily actions and facilitates corresponding actions.
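To make the logic of such approach and avoidance tasks concrete, the following minimal Python sketch shows one way a congruency effect could be computed from reaction times. It is only an illustration under stated assumptions: the reaction times are invented for this example and are not data from Chen and Bargh (1999) or Filik et al. (2015), and the names `reaction_times_ms` and `congruency_effect` are hypothetical. Pulling for positive stimuli and pushing for negative stimuli are treated as congruent responses, as described above.

```python
from statistics import mean

# Hypothetical reaction times in milliseconds, invented purely for illustration.
# Keys are (stimulus valence, lever movement).
reaction_times_ms = {
    ("positive", "pull"): [610, 590, 630],
    ("positive", "push"): [680, 700, 660],
    ("negative", "pull"): [690, 710, 670],
    ("negative", "push"): [620, 600, 640],
}


def congruency_effect(rts):
    """Return mean incongruent RT minus mean congruent RT.

    Congruent trials: positive-pull (approach) and negative-push (avoidance).
    A positive value means congruent responses were faster on average.
    """
    congruent = rts[("positive", "pull")] + rts[("negative", "push")]
    incongruent = rts[("positive", "push")] + rts[("negative", "pull")]
    return mean(incongruent) - mean(congruent)


effect_ms = congruency_effect(reaction_times_ms)
print(f"Congruency effect: {effect_ms:.1f} ms")
```

A positive difference in this sketch would correspond to the pattern reported in the studies above, with faster approach movements for positive stimuli and faster avoidance movements for negative stimuli. Published analyses typically use inferential statistics on trial-level data rather than a simple difference of means.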

Interestingly, embodiment of emotions goes beyond one's own body and is even reflected in our language and the physical space surrounding us. For example, when describing emotional states, an individual that is currently in a sad mood might describe themselves as "feeling down" and an individual in good spirits might "feel elevated." The arousal level and valence associated with both of these affective states (low/negative and high/positive, respectively) are reflected in the language used to communicate about these affective states. A study asking participants to place negatively, neutrally, and positively valenced terms within a three-dimensional space found that the placement reflected their valence (Marmolejo-Ramos et al., 2018). Words associated with positive valence were placed high up and close to the participants, words of negative valence were placed low and farther away from participants, and neutral words in between. The evaluation of a term as positive vs negative thus affected both the height and the proximity at which it was placed. It seems that embodiment of emotion does not only entail our own bodies but also the physical space surrounding our bodies.

Neuroscientific methods can also be used to demonstrate the effect embodiment of emotion has on our cognitions, behavior, and body itself. Price et al. (2012) conducted a study displaying positive and neutral images to participants while electroencephalography was recorded, and the position of participants was manipulated to reclining or leaning. Results showed that the late positive potentials were larger when participants were leaning toward positive images, but no effect of body posture was found during the viewing of neutral images. This study demonstrates that even in the absence of a cognitive task, embodiment of emotion takes effect, as specific body postures modulated brain activity. Such findings suggest that embodiment of emotion might be the result of primal reactions like approach and avoidance of emotional stimuli taking effect also in higher order processing of emotional stimuli. To conclude this section, embodiment of emotion can be investigated on a behavioral, peripheral–physiological, and neural level, individually or in combination, as has been shown in the various parts of this chapter.

#### **Conclusion**

Embodiment is a subject that has broadened the scientific discussion about the biological system, self-regulation, and neural processes. As shown in this chapter, perceptual illusions have proved to be an important tool for manipulating the incorporation of the body itself and the overlap with the body of the other. Interestingly, the embodiment of the other has helped us to understand social processes such as empathy, racial bias, changes in the negative valence of judgments of the other, and social perception, among other aspects. These studies have shown the malleability and rapid adaptability of implicit social attitudes when we experience the body of the other, despite existing cultural differences. The advancement of research on social embodiment has allowed us to better understand categorization, prejudice, and discrimination from the embodiment of the other and its neural and physiological correlates. Moreover, neuroscientific methods help us demonstrate the similarity in neural patterns during emotional experience and during the simulation of an emotional experience, and can thus provide evidence for the embodiment of emotion. This neuroscientific evidence adds to the vast evidence from behavioral studies on embodiment of emotion, such as embodied emotion expressed through body posture, facial expression, language, and cognitive processes (e.g., stimulus evaluations).


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 4 The Neuroscience of Beauty**

**William Edgar Comfort and Ana Luísa Freitas**

**Abstract** Appreciating beauty is part of everyday life, when we contemplate fine arts, architecture, music, and natural scenes. Aesthetic appreciation, like any ordinary phenomenon of human life, triggers affective and cognitive processes that can provide the subject with sensations of hedonic pleasure and cognitive self-reward (Leder H, Belke B, Oeberst A, Augustin D. Br J Psychol 95(4):489–508, 2004). Although humans share several neuropsychological processes, the experience of aesthetic appreciation is undeniably idiosyncratic, and sometimes it is not that simple to find beauty where we were supposed to find it, and more often the same object can elicit different reactions amongst observers.

**Keywords** Aesthetic stimuli · Aesthetic appreciation · Halo effect · Default-mode network

#### **Introduction**

Appreciating beauty is part of everyday life, when we contemplate fine arts, architecture, music, and natural scenes. Aesthetic appreciation, like any ordinary phenomenon of human life, triggers affective and cognitive processes that can provide the subject with sensations of hedonic pleasure and cognitive self-reward (Leder et al., 2004). Although humans share several neuropsychological processes, the experience of aesthetic appreciation is undeniably idiosyncratic, and sometimes it is not that simple to find beauty where we were supposed to find it, and more often the same object can elicit different reactions amongst observers. Let us take the following situation as an example.

I once had a conversation with an artist friend (WC) about Salvador Dalí's Surrealist object, Lobster Telephone (1938). I told her that, though I had a great appreciation for Dalí's other work, that particular piece irritated me no end: I saw it as a lazy juxtaposition of two randomly selected items. She replied that the role of art was just that – to provoke; to elicit a reaction, any reaction, even an emotionally negative one, in the viewer. Although I still hold that initial aversive reaction to Dalí's five versions of the telephone, my friend was right to emphasise the 'experience' of the artwork rather than its objective attributes, and consequently the subjectivity implicit in both the intensity and valence of my emotional reaction. Beyond the oft-quoted expression that 'beauty is in the eye of the beholder', a Brazilian variant contends that 'he who loves the ugly holds the perception of beauty' (*quem ama o feio, bonito lhe parece*), that is, we are able to distinguish and even empathise with a positive aesthetic experience in another even when we objectively view the object of their desire as ugly. As such, aesthetic experience goes beyond merely a shared and plastic conception of beauty and can elicit conflicting and often contradictory mental and emotional states in the same observer (Fig. 4.1).

W. E. Comfort (\*) · A. L. Freitas

Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil e-mail: 9032936@mackenzie.br

In light of this perceptual variability in response to aesthetic stimuli, there has been increasing interest in neuroaesthetics, a relatively recent field which studies the biological bases underlying aesthetic experiences. Such experiences may include the evaluation of facial attractiveness, the appraisal of paintings, sculptures and other works of art, and complex emotional reactions to beauty, either in the natural environment or to man-made structures. Contributions to the study of neuroaesthetics are wide-ranging, drawing from such dispersed disciplines as visual perception, art theory and emotion, and hold important insights for more established areas of study in attention, face recognition, and cognitive ergonomics.

Beauty can be defined as a property or value of an object, natural scene, or person which engenders a physiological and psychological experience of pleasure and satisfaction. Cognitive neuroscience as a whole is well-placed to provide greater understanding of how humans form and process the experience of beauty, from a predominantly dopaminergic network for encoding hedonic value to the effect of context and long-term memory on the modulation of our individualised experience of beautiful stimuli. Beauty is often predominantly measured in terms of positive affective response to aesthetic stimuli, such as paintings, physically attractive faces and natural scenes, or even the activation of a subset of distinct brain regions, such as the orbitofrontal cortex, an approach frequently criticised as scientific reductionism by colleagues in the humanities (Brown & Dissanayake, 2009).

**Fig. 4.1** Salvador Dalí's lobster telephone (1938)

Within the field of neuroaesthetics, however, the aesthetic experience is not merely reduced to the perception and appreciation of beauty but extends to the global affective and cognitive valuation of external stimuli, either for their artistic or other intrinsic qualities. Pearce et al. (2016) effectively distinguish between the cognitive neuroscience of art and aesthetics, arguing that an appropriate conceptualisation of neuroaesthetics must consider artistic stimuli not merely in aesthetic terms but through a broader context of modulating factors, such as expertise, perceived value, and complex emotional states.

In a series of groundbreaking syntheses, Anjan Chatterjee, Oshin Vartanian, and colleagues (Chatterjee, 2011, 2014; Chatterjee & Vartanian, 2014, 2016; Pearce et al., 2016) have set forth the key aspects within the neuroscience of aesthetics, in what they denominate the triad of aesthetic experience: distinct brain networks for sensory-motor, emotion-valuation, and meaning-knowledge functions in the appraisal of aesthetic stimuli (see Fig. 4.2).

#### **Defining Neuroaesthetics**

Any workable definition of neuroaesthetics must proceed from work in perception, more especially visual perception, given the primacy of this sense in human perception (Kupers et al., 2011). To this end, Ishizu and Zeki (2011) investigated whether similar patterns of brain activity were correlated across different sensory modalities. Their hypothesis, in accordance with Burke's (2014) assertion of a single representation of beauty across different sensory modalities, was that a similar region of the medial orbitofrontal cortex (OFC) would be activated in response to aesthetic stimuli from both a visual and auditory source. Stimuli classified through ratings as 'beautiful', 'indifferent', and 'ugly' were presented as pairs in an aesthetic judgement task and a control brightness judgement task. The contrast between activation in the aesthetic > brightness task revealed significant activation in the lateral and medial OFC and superior frontal gyrus, indicating a selective OFC response to aesthetic judgement.

A neural system for aesthetic perception has been posited as relying on the same underlying brain structures as those for emotional processing as well as a generalised object-appraisal system (Brown et al., 2011). According to this view, a 'naturalization' of neuroaesthetics is required, in which the neural bases of aesthetic perception are more directly linked to the basic valuation of sensory stimuli, specifically mapped to the gustatory cortex, consistent with the identification of the anterior insula as the most concordant area of activation between studies of aesthetic judgement (Brown et al., 2011), an area more commonly associated with the valuation of taste.

Reber et al. (2004) proposed that aesthetic pleasure is related to the fluency with which an observer can process the characteristics of a given object. The easier it is to integrate the plastic elements, the greater the perceived congruence; and the more fluent the observation, the greater the chance of a positive evaluation of what is observed. The fluidity of these perceptual dynamics is associated with the elicitation of pleasure, and negative reactions are common when observing asymmetric or disharmonic combinations (Ikeda et al., 2015). An absence of or interruption to this fluidity may be one of the main contributors to our sometimes negative affective experiences with aesthetic stimuli, as in the author's reaction to Dalí's Lobster Telephone described at the beginning of this chapter.

In a functional magnetic resonance imaging (fMRI) study designed to understand the neural mechanisms underlying the aesthetic and emotional aspects of colour perception, Ikeda et al. (2015) found activation of the left medial orbitofrontal cortex (mOFC) during the observation of visually congruent stimuli and of the left amygdala during the observation of incongruent stimuli. Their results led them to conclude that stimulus valuation is conditioned by automatic visual processing of stimulus features, mediated by the amygdala, and by aesthetic values represented in the mOFC. These results suggest that different aspects of the appraisal of object valence may be supported by separate brain regions.

Additionally, while studying different behavioural and electrophysiological responses to aesthetic experiences with modern art, Pihko et al. (2011), Leder et al. (2014), and Else et al. (2015) suggested that the observers' backgrounds should be considered when interpreting differences in response, as both the expertise level of the observer and semantic content, such as labels and titles (Gerger & Leder, 2015), can influence the implicit evaluation of art. For example, Gerger and Leder (2015) found varying activation in the corrugator muscle of the eyebrow and the zygomatic major facial muscle of viewers according to whether the artwork was titled or untitled and whether the title was semantically congruent or not. These different, often discrete, muscle activations can be recorded by facial electromyography (fEMG), and the data collected by Gerger and Leder (2015) suggest that when the title was absent or incongruent, observers reported less interest and subjective aesthetic appreciation.

Although the results of Gerger and Leder (2015) appear to corroborate the relationship between the valuation of experience and perceptual fluency (Reber et al., 2004), this pattern of response may be more directly attributable to study participants having been influenced mainly by the automatic emotional processes underlying perceptual fluidity, as previously proposed by Brown et al. (2011). Corrugator and zygomatic contraction is associated with greater cognitive effort and is therefore also assumed to reflect a reduction in perceptual fluency and a consequent increase in reports of negative experience. What the authors point out, however, is that fluidity of processing in the conjunction of characteristics and contextual information is not in itself sufficient for positive aesthetic evaluations, as moderate levels of cognitive effort can often contribute to the positivity of aesthetic experiences, a fact observed in cases of expert evaluations that focus on art in a more elaborate way (Gerger & Leder, 2015).

#### **Motivation and Facial Attractiveness**

Another fundamental aspect of beauty research concerns its importance for the maintenance of the species. Evolutionarily, attractive faces constitute an important factor both for the establishment of sexual relations (reproductive ends) and for parental behaviour (Hahn & Perrett, 2014), because they usually symbolise fertility, gene quality, and health (Chatterjee & Vartanian, 2016).

With regard to reproduction, heterosexual observers show activation of a neural network involved in motivation and reward, comprising structures such as the nucleus accumbens, the medial prefrontal cortex, the anterior dorsal cingulate and the orbitofrontal cortices, which responds more strongly to attractive than to unattractive faces of people of the opposite sex. Amongst both heterosexuals and homosexuals, Hahn and Perrett (2014) point out that there are comparative studies in the literature that indicate greater activation of the orbitofrontal cortex and dorsomedial thalamus when observers are presented with faces of people of the desired sex, regardless of sexual orientation or even the gender of the observer. These results are consistent with previous research by Ishai (2007), who found greater OFC activation in response to attractive male faces in heterosexual women and homosexual men and greater OFC activation in response to attractive female faces in heterosexual men and homosexual women.

Similar to mate bonding behaviours amongst adults, the attractiveness of an individual infant's face appears to influence both caregiver behaviour and the quality of care (Langlois et al., 1995). Cute or attractive children are more likely to receive care and positive affects, such as tenderness, and less likely to suffer aggression (Hahn & Perrett, 2014), which is possibly related to the fact that cute children are usually seen as healthy and worthy of parental investment. In an fMRI study, Glocker et al. (2009b) found that the baby schema, i.e., 'a set of infantile physical features' (Glocker et al., 2009a), activates the nucleus accumbens (NAcc), a key structure of the reward system, in nulliparous women.

Sexual and gender differences in neural responses modulated by attractive faces and infant cuteness need further investigation, but what we know so far leads us to the hypothesis that infant facial attractiveness influences human caregiving, regardless of kinship, and that attractiveness in adult faces continues to play a key role in mating bonds.

#### **Beauty and the Beast? Aesthetics and Complex Emotions**

The evaluation of aesthetic stimuli is not only related to aspects of positive or negative valuation or reward processing. It is often related to the development of complex emotions such as awe, envy, and anxiety (Armstrong & Detweiler-Bedell, 2008), closely linked to the perception of the sublime in art, a philosophical concept underpinning our understanding of subjective emotional responses to aesthetic stimuli (Chatterjee & Vartanian, 2016). In an fMRI study, Cupchik et al. (2009) found significant bilateral activation in the insula when participants viewed artworks, consistent with an emotional aspect in aesthetic evaluation, suggesting that the aesthetic experience emerges from a top-down guiding of attention and bottom-up perceptual cues for fluency and visual organisation.

One key network in the subjective evaluation of aesthetic stimuli is the default-mode network (DMN), implicated in mind-wandering, autobiographical memory, and other processes. Vessel and colleagues tested fMRI response in participants rating artworks on how 'moving' they perceived them to be, on a scale of 1–4 (Vessel et al., 2012). They found an increase in activation in several areas within the DMN, including the anterior medial PFC (aMPFC), lateral orbitofrontal cortex, posterior cingulate cortex, and hippocampus, in response to more emotionally 'moving' artworks. Importantly, however, only the images rated most highly for emotional resonance led to activation in the aMPFC, in contrast to other studies (Kawabata & Zeki, 2004; Ishizu & Zeki, 2011) which found that activation varied linearly with the emotional response to aesthetic stimuli. This pattern of results suggests that a 'sublime' experience of aesthetic stimuli depends on an intense emotional reaction and corresponding activation in the anterior medial PFC in conjunction with the more commonplace aesthetic valuation occurring primarily in the OFC.

#### **Judging Books By Their Covers**

Despite its prevalence and the automaticity with which it occurs, the so-called 'halo effect' is a frequently employed cognitive bias that guides and directs our decision-making and judgement, causing relevant aspects to be relegated and others, such as aesthetics, to stand out and interfere with one's general judgement, even when there is sufficient information available to form an independent evaluation of extraneous salient attributes (Nisbett & Wilson, 1977).

Individuals whose faces are more attractive are often judged more positively in various dimensions and also receive different treatment in different domains of social life (Liang et al., 2010). The evaluation and personality traits attributed to facial attractiveness seem to have emerged as an adaptive response, in the light of evolutionary hypotheses that misshapen faces were associated with parasites, disease, and a lack of biogenic immunity. This evolutionarily driven preference for attractive, unblemished faces affects even otherwise healthy individuals, who nevertheless have a facial asymmetry conveying inferior levels of health, intelligence, and sociability (Zebrowitz et al., 2002; Liang et al., 2010).

#### **Conclusion**

The perceptual, behavioural, and neural mechanisms involved in the perception of aesthetic stimuli are key to our understanding of the everyday interactions and motivations which characterise our relationships with the natural and man-made world. The aim of understanding the cognitive processes underlying aesthetic appreciation has been a key driver for applying the tools and methods of cognitive neuroscience to the field of art and aesthetic stimuli and establishing links with the parallel study of perceptual, emotional, attentional, semantic, memory, and decision-making processes. Investigating neural networks engaged in decoding and valuing aesthetic content is an important challenge that should involve other areas of knowledge. For the phenomena of aesthetic appreciation to be understood in all their complexity, it is necessary to integrate into the physiological aspects also the historical, social, and cultural aspects, which holistically make up the person.



## **Part II Social Neuroscience and Moral Emotions**

## **Chapter 5 Mirror Neurons in Action: ERPs and Neuroimaging Evidence**

**Alice Mado Proverbio and Alberto Zani**

**Abstract** According to V.S. Ramachandran (inaugural 'Decade of the Brain' lecture at Society for Neuroscience meeting), 'mirror neurons are to neuroscience what DNA was to biology'. Their discovery (by Rizzolatti's group) led to the understanding of how hominids rapidly evolved through imitation and cultural transmission in the last 100,000 years. In this chapter, we will review the role of the human mirror neuron system (MNS) in several mental and brain functions, including: interacting with the environment, grasping objects, empathy and compassion for others, empathizing, emulation and emotional contagion, observing and imitating, learning sports, motor skills and dance, motor rule understanding, understanding the intentions of others, understanding gestures and body language, lip reading, recognizing actions by their sounds, and learning to play a musical instrument. The chapter is enriched with a discussion of possible criticalities and caveats.

**Keywords** Mirror neuron system · Empathy · Audio-visuomotor neurons

#### **Introduction**

Many neuroimaging studies have searched for the human correlate of the monkey 'mirror neuron system' (MNS) and tried to isolate mirror neurons by using noninvasive imaging techniques such as fMRI (e.g. Dinstein, 2008; Iacoboni, 2005; Iacoboni et al., 2005; Schmidt et al., 2021). The results provided evidence of a strong activation of the anterior intraparietal area (AIP) and the ventral premotor areas (F5) when subjects passively observed others performing movements, actively executed movements themselves, or imitated movements made by others. In addition, AIP and F5 areas were frequently found engaged in tasks involving empathy, social cognition, and theory of mind, along with the inferior frontal gyrus, inferior parietal cortex, fusiform gyrus, posterior superior temporal sulcus, and amygdala. These data seem to suggest the existence of a shared neural mechanism for social cognition.

A. M. Proverbio (\*) Department of Psychology, University of Milano-Bicocca, Milan, Italy e-mail: mado.proverbio@unimib.it

A. Zani School of Psychology, Vita-Salute San Raffaele University, Milan, Italy

Despite the long line of research, studies on the human MNS still suffer from some severe methodological problems. Electrophysiological single unit recordings, which are required for a clear-cut demonstration of mirror neuron properties, are not feasible in humans. Therefore, the majority of studies approaching the MNS in humans rely on methods with low temporal resolution, such as fMRI, which is an indirect method based on the blood oxygenation level-dependent (BOLD) signal and does not directly measure neuronal activity (see also the paragraph devoted to pitfalls at the end of the chapter). In this regard, ERPs can be excellent tools for providing the necessary temporal resolution for studying action and gesture recognition processes in healthy humans.

#### **Visuomotor Neurons and Action Encoding**

Mirror neurons (MNs) were first discovered in the ventral premotor cortex (PMv cortex) of the macaque monkey (F5 area) by Rizzolatti and Luppino (2001). These neurons were activated both when the animal performed a specific motor action and when it observed another simian or human individual performing that same action. The MNs do not respond to the simple presentation of food or other objects that also affect the animal, nor are they activated by the observation of a mimed action without the presence of the objects. In order for the MNs to activate (or 'to fire', i.e., to show an intense discharge frequency), an actual interaction of the hand with a target object of the action is essential. Despite being motor neurons, MNs are not activated by single movements (e.g. of the fingers) comprising a whole motor act, but, like all the other neurons in the PM cortex, are instead activated in association with goal-directed and purposeful motor actions. The MNs are stimulated by the execution/observation of motor actions performed with the hand, but also with the mouth. They are very sensitive to the type of grip (i.e. precision grip, power grip, grip of small or large objects, grip of little seeds, etc.) and encode the *action's goal*. For instance, the neural micro-population that encodes the gesture of taking an apple will not be the same if its purpose is to eat it (i.e. the animal takes the apple and then brings it to the mouth), or to throw it away (i.e. the monkey takes the apple and throws it). After the discovery of MNs in the premotor cortex (PM), other studies have shown their presence in the inferior parietal lobe (IPL), in particular in the rostral portion of this brain lobe. These neurons would be more involved in the representation of the actions associated with an object or a tool of which they process the motor properties (i.e. *affordances*), such as, for instance, its graspability and/or usability, while dealing with information coming from the fronto-parietal-occipital visual ventral stream (VVS).

Many studies have shown that the mirror neuron system (MNS) is also present in humans (Rizzolatti & Craighero, 2004; Rizzolatti & Sinigaglia, 2016). Fine examples of these studies are the EEG investigations on the reactivity of brain rhythms during action observation. Many studies have shown that the sight of actions performed by other individuals (with hands, legs, fingers, etc.) induces a block of observers' sensory-motor EEG rhythm (or so-called *mu* rhythm) recorded at scalp sites, which would reflect a state of relative inactivity in the Rolandic region (e.g. Lelord et al., 1998). An important PET study on human volunteers is the one carried out by Rizzolatti and colleagues (Rizzolatti et al., 1996), which allowed a first localization of the areas involved with the MNS during the observation of grasping movements. Volunteers were tested in three different conditions. In the first, they observed grasping gestures of common objects performed by the experimenter; in the second, they proceeded to reach and grasp the objects themselves, while in the third, they simply observed the objects. The results showed that only action observation significantly activated the inferior parietal lobule (IPL) and the ventral premotor area (PMv) together with the posterior portion of the inferior frontal gyrus (IFG) (Fig. 5.1).
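The mu-rhythm block described above is often summarized as a single number per participant. The sketch below is illustrative only: the log-ratio index is one common convention assumed here, not a measure specified in this chapter, and the function name `mu_suppression_index` and the power values are hypothetical.

```python
import math


def mu_suppression_index(power_observation, power_baseline):
    """Log ratio of mu-band power during action observation to baseline power.

    Assumed convention: values below 0 indicate suppression ("block") of the
    mu rhythm, values near 0 indicate no change relative to baseline.
    """
    if power_observation <= 0 or power_baseline <= 0:
        raise ValueError("band power must be positive")
    return math.log(power_observation / power_baseline)


# Hypothetical mu-band power values (arbitrary units) at a central scalp site.
baseline_power = 4.0      # rest condition
observation_power = 2.0   # while watching another person's hand action

index = mu_suppression_index(observation_power, baseline_power)
print(f"mu suppression index: {index:.3f}")
```

A ratio is used rather than a raw difference so that participants with different overall EEG power can be compared on the same scale.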

Other studies have shown that the MNS is not only activated at the sight of gestures but also of manageable objects. By means of fMRI evidence, Creem-Regehr and Lee (2005) demonstrated that graspable tool shapes activated motor-related regions of the cortex, including the PMv area and the posterior parietal cortex (PPC). The event-related potential (ERP) study by Proverbio et al. (2011a, b) provided the possible time course of this activation, showing that the earliest neural tool/non-tool discrimination was indexed by an increased anterior negativity in the 210–270 ms post-stimulus latency range in response to tools rather than to objects. Source reconstructions for these findings highlighted the contribution of left-sided brain premotor and somatosensory cortices, possibly including the anterior intraparietal sulcus (aIPS). Further studies demonstrated that the cortical representation of actions (especially tool manipulation and use) is asymmetrically represented over the left hemisphere. Indeed, a lesion of the left inferior parietal cortex (IPC, BA40) is often associated with apraxic deficits, whilst a right-sided lesion rarely causes these deficits (Goldenberg & Spatt, 2009). The question of whether this hemispheric asymmetry depends on right-hand use or a hemispheric functional specialization for fine-grained, precision movements has been explored in another ERP study by Proverbio et al. (2013). The authors recorded ERPs to pictures depicting unimanual (e.g. a hammer) or bimanual (e.g. a bicycle handlebar) tools, while participants were instructed to respond motorically to infrequent images of green plants (Fig. 5.2). A prefrontal N400 component (elicited by non-targets) was much larger over the left scalp sites to bimanual than unimanual tools. swLORETA (acronym for *standardized weighted LOw-REsolution electromagnetic TomogrAphy*) source reconstruction revealed that besides the left and right parietal cortices (BA39, BA40), tool observation always activated the left premotor cortex (BA6) regardless of the hand involved in their manipulation/use. Overall, these data suggest that looking at tools automatically activates mental representations associated with their manipulation, with a left-sided hemispheric asymmetry for this brain activation.

**Fig. 5.1** Adjusted mean regional cerebral blood flow recorded by Rizzolatti et al. (1996) during grasping observation. The data are displayed as statistical maps superimposed on three planar projections (sagittal, coronal, and transverse) and as a cortical rendering of the lateral surface of the left hemisphere. Pixels significant at *p < 0.001* are shown in red

**Fig. 5.2** Examples of pictures depicting bimanual and unimanual tools used as stimuli in Proverbio et al.'s (2013) ERP study
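An ERP effect such as the 210–270 ms anterior negativity mentioned above is typically quantified as the mean amplitude within the latency window, compared between conditions. The following is a minimal sketch of that computation on toy data; the waveforms, the 10 ms sampling step, and the function name `mean_amplitude` are assumptions made for illustration and do not reproduce the cited studies' data or pipeline.

```python
from statistics import mean


def mean_amplitude(erp, times_ms, start_ms, end_ms):
    """Average one ERP waveform over a latency window (bounds inclusive)."""
    window = [amp for amp, t in zip(erp, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return mean(window)


# Hypothetical averaged waveforms (microvolts), sampled every 10 ms from 0 to 400 ms.
times_ms = list(range(0, 401, 10))
non_tool_erp = [0.0 for _ in times_ms]
# Toy tool waveform: 1.5 microvolts more negative between 210 and 270 ms only.
tool_erp = [-1.5 if 210 <= t <= 270 else 0.0 for t in times_ms]

tool_amp = mean_amplitude(tool_erp, times_ms, 210, 270)
non_tool_amp = mean_amplitude(non_tool_erp, times_ms, 210, 270)
difference = tool_amp - non_tool_amp  # negative = larger negativity for tools
print(f"tool minus non-tool, 210-270 ms: {difference:.2f} microvolts")
```

In practice the window bounds would be fixed in advance from the literature or from an independent data set, so that the measurement window is not chosen on the basis of the effect being tested.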

#### **Mirror Neurons and Understanding the Intentions of Others: Empathy**

An fMRI study by Iacoboni et al. (2005) has robustly demonstrated that the activation of visuomotor MNs makes it possible to share behavioral goals and to understand other people's intentions (a multifaceted capacity called *mentalizing* or *theory of mind*). In this famous experiment, participants observed three types of stimuli: grasping actions without context (the box in the middle in Fig. 5.3), the context without actions (the left box in Fig. 5.3), and manual actions performed in two different contexts (the right upper or lower boxes in Fig. 5.3). In this last condition, the context suggested the intention associated with the grasping action (i.e. drinking or clearing). Actions associated with the specific contexts produced a significant increase of the BOLD signal at the back of the IFG and in the PMv, both part of the MNS. Furthermore, the activation was greater for drinking (biologically more relevant) than for clearing. These data showed how these regions, active during the execution and observation of an action, were also involved in understanding the intentions of others.

**Fig. 5.3** Types of stimuli used in Iacoboni et al.'s (2005) study. The same action (e.g. to take a mug) reveals an agent's different intention according to the context in which he/she is. Such an intention is encoded and inferred by means of fronto-parietal MNS activation. (Courtesy of Marco Iacoboni)

Because of its ability to understand visual gestures and their aims, the fronto-parietal MNS is involved in a multiplicity of mental functions including:


It has been shown that the recognition of body language, both symbolic and affective, as well as of the congruence of people's gestures, is strongly related to the fronto-parietal MNS receiving and processing information from brain regions specialized in recognizing faces (i.e. fusiform face area or FFA), facial expressions (i.e. FFA and superior temporal sulcus, STS), and bodies (i.e. extrastriate body area, EBA). In a series of electrophysiological studies by Proverbio et al. (2010, 2014a, 2015a), visual ERPs were recorded in different samples of volunteers viewing hundreds of images depicting actors and actresses mimicking a symbolic gesture (iconic, deictic, or emblematic, such as, for instance, those in Fig. 5.4, top), displaying an emotional state through body language (as shown in Fig. 5.4, middle), or using a tool (Fig. 5.4, bottom). In half the cases, the scene was incongruent with its verbal description and/or with respect to pragmatics or standard knowledge about tool use. In all cases, the perception of incongruent images (from the point of view of the gesture or of the action meaning and/or aim) elicited a wide negative response (i.e. N400) tending to be larger at anterior scalp sites. When the swLORETA inverse solution was applied to the N400 potential (within its time window of occurrence), it emerged that the incongruity between actions and their presumed intentions activated slightly different neural circuits in the three conditions (certainly more emotional in the case of body language; Fig. 5.4, middle), but invariably including the inferior regions of both the frontal premotor and parietal areas (i.e. the fronto-parietal MNS), in addition to the anterior cingulate cortex (ACC), the superior temporal cortex (STC), and visual areas (FFA and EBA).

All in all, these data suggest that the MNS underpins the ability to recognize the intentions of an agent, through the observation of a gesture and the motor simulation of that same gesture by an observer.
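The N400 effects described above are typically quantified on the difference wave between incongruent and congruent conditions. The following is only a minimal sketch of that idea, not the analysis pipeline of the cited studies: the waveforms are synthetic, and the sampling step, the 300–500 ms window, and the function name `n400_effect` are assumptions introduced for illustration.

```python
import numpy as np

# Hypothetical time axis: -100 to 800 ms, one sample every 2 ms (assumed).
times_ms = np.arange(-100, 801, 2)

# Synthetic ERPs in microvolts (illustrative only, not real data):
# a flat congruent waveform and an incongruent waveform with a
# negative deflection centred at 400 ms.
congruent = np.zeros(times_ms.size)
incongruent = -4.0 * np.exp(-0.5 * ((times_ms - 400) / 60.0) ** 2)

def n400_effect(congruent, incongruent, times_ms, window=(300, 500)):
    """Return mean amplitude and peak latency of the difference wave
    (incongruent minus congruent) inside an assumed N400 window."""
    diff = incongruent - congruent
    in_win = (times_ms >= window[0]) & (times_ms <= window[1])
    mean_amp = diff[in_win].mean()
    # The N400 is a negativity, so the peak is the minimum of the difference.
    peak_latency = times_ms[in_win][np.argmin(diff[in_win])]
    return mean_amp, peak_latency

mean_amp, peak_latency = n400_effect(congruent, incongruent, times_ms)
print(f"mean amplitude: {mean_amp:.2f} uV, peak latency: {peak_latency} ms")
```

A more negative mean amplitude in the window corresponds to a larger N400 effect; with real data the two inputs would be condition-averaged ERPs from one electrode or electrode cluster.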

**Fig. 5.4** Examples of congruent (left column) and incongruent (right column) stimuli used in the studies by Proverbio et al. (2010, 2014a, 2015a), associated with the N400 electrophysiological effect (third column), reflecting the violation of an expectation related to the aim of the action or of the gesture expressed by the actors, with reference to a shared grammar of gestures or to the context and the pre-established use of a tool. The N400 effect is drawn as a red continuous line in the upper waveforms, as a blue continuous line in the middle waveforms, and as a red dotted line in the lower waveforms (where ERPs are shown in red for women and in blue for men). (Reproduced and modified with the permission of the authors)

#### **Observation and Imitation**

The ability to imitate the gestures of others, either unconsciously (e.g. as in yawning or in posture, such as crossing the legs) or consciously (e.g. when we imitate the master's gesture to successfully learn to play tennis), is strongly based on the MNS. Contagious yawning, for example, was investigated by Usui et al. (2013) in a study in which children with autistic spectrum disorder (ASD) and typically developing children were shown yawning frames (i.e. the face of a yawning woman) vs. control frames (i.e. the face of a smiling woman) while watching a cartoon. To ensure participants' attention to the face, an eye tracker controlled the onset of the yawning and of the control stimuli. Results demonstrated that both ASD and control children yawned more frequently when they watched the yawning stimuli than the control stimuli (without any significant group differences). It was therefore suggested that the absence of contagious yawning in children with ASD, as reported in previous studies, might have been related to their weaker tendency to spontaneously attend to others' faces.

The link between action production and observation has also been explored in 'automatic imitation' or 'visuomotor priming' paradigms, where participants perform an action that is either congruent or incongruent with an observed movement. If action observation and action production employ shared mechanisms (namely, mirror neurons; Iacoboni, 2005), performing an action that is compatible with the observed action should lead to facilitation, while performing an action that is incompatible with the observed action should result in an interference effect. This pattern of results has been widely documented. For example, Craighero et al. (1996) primed healthy subjects, while they were ready to execute a grasping movement, by visually presenting them with drawings irrelevant to the task to be executed. Drawings visually congruent with the object to be grasped markedly reduced response times, thus facilitating grasping actions, whereas incongruent drawings did not. This study provided some of the first evidence for the existence of visuomotor priming.

When we observe others, the motor and sensorimotor systems are activated to process and simulate the observed gesture. This activation induces the desynchronization of the EEG *mu* rhythm, an 8–12 Hz oscillation with a central-parietal topographic distribution over the scalp. The *mu* rhythm reflects a state of relative inactivity of the Rolandic region, a kind of *stand-by* from motor or somatosensory processing. Therefore, its desynchronization indicates an activation of the neurons of this same area, committed to coding an observed or performed action, and can be used to measure MN activity in both human adults (Pfurtscheller et al., 2006) and infants (Nyström et al., 2011). For example, Proverbio (2012) provided evidence that watching manipulable objects automatically activates their motor properties, as indexed by the desynchronization of the *mu* rhythm over centro-parietal scalp sites during perception of tools vs. non-manipulable objects. Other studies have shown a lack of event-related desynchronization (ERD) of the beta and *mu* rhythms in ASD children during perception of actions, as opposed to comparable ERD responses during action execution (Oberman et al., 2008). Interestingly, Van Elk et al. (2008) showed that the longer infants' motor experience with crawling, the stronger the *mu* rhythm desynchronization during observation of other children's crawling. This finding indicates that experience strongly modulates MNS responsivity. As proof of this, it has been shown that the skills acquired in a certain athletic or sporting discipline, or, for instance, in dance, strongly modulate MNS responsivity. Proverbio et al. (2012) compared EEG/ERP signals relative to the visual processing of actions that violated basketball rules (e.g. in defense, blocking, and shooting actions) with that of correct basketball actions in professional basketball players and controls. They found that incorrect actions elicited anterior N400 responses reflecting the automatic detection of action incorrectness only in professional players (see ERP waveforms in Fig. 5.5). According to source reconstruction, N400 generators included the fronto-parietal MNS, the cerebellum, the EBA, and the STS. Similarly, the detection of incorrect dance gestures has been shown to elicit a response in the fronto-parietal MNS circuits in professional dancers vs. controls (Calvo-Merino et al., 2005; Orlandi et al., 2017).
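The *mu* desynchronization measure discussed above is commonly expressed as a percentage change in band power relative to a baseline interval, ERD% = (A − R) / R × 100, where R is the baseline power and A the power during the event. The sketch below illustrates this computation under stated assumptions: the signals are synthetic sinusoids, the sampling rate and the helper names `band_power` and `erd_percent` are hypothetical, and the simple FFT-based power estimate stands in for whatever spectral method a given study actually used.

```python
import numpy as np

def band_power(signal, fs, f_lo=8.0, f_hi=12.0):
    """Mean spectral power of `signal` in the [f_lo, f_hi] Hz band
    (8-12 Hz by default, the mu range given in the text)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return power[mask].mean()

def erd_percent(baseline, event, fs, f_lo=8.0, f_hi=12.0):
    """Percentage band-power change from baseline (R) to event (A).
    Negative values indicate desynchronization (power decrease)."""
    r = band_power(baseline, fs, f_lo, f_hi)
    a = band_power(event, fs, f_lo, f_hi)
    return (a - r) / r * 100.0

# Synthetic example (not real EEG): a 10 Hz rhythm whose amplitude
# halves during action observation. Sampling rate is an assumption.
fs = 250
t = np.arange(2 * fs) / fs                    # 2 s of samples
baseline = 2.0 * np.sin(2 * np.pi * 10 * t)   # resting mu rhythm
event = 1.0 * np.sin(2 * np.pi * 10 * t)      # attenuated mu rhythm

erd = erd_percent(baseline, event, fs)
print(f"ERD: {erd:.1f}%")
```

Because power scales with the square of amplitude, halving the amplitude of the rhythm corresponds to a 75% power decrease in this toy case.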

**Fig. 5.5** Grand-average ERPs recorded in professional basketball players (**a**) and naïve viewers (**b**) in response to correct and incorrect basketball actions at frontal, parietal, and occipital scalp sites. (Taken and redrawn from Proverbio et al., 2012)

#### **Audio-Visuomotor Neurons**

The existence of multimodal audiovisual cortical regions has been demonstrated both for phonetic/articulatory language (i.e. verbal language) and for human and animal vocalizations (e.g. a chirp, a whinny, a cry, laughter), as well as for the encoding of noises typically produced using objects (e.g. the noise produced by crushing nuts, or by chewing). These multimodal neurons are a particular class of MNs that encode both visual and auditory information.

#### *Audio-Visuomotor Neurons in Language and Vocalizations*

The existence of a link between motor and perceptual representations of language has long been demonstrated. According to Liberman's theory (Liberman & Mattingly, 1985), understanding a phoneme would strictly correspond to knowing how to pronounce it. For example, in an fMRI study on healthy subjects, Pulvermüller and Shtyrov (2006) found that, while participants listened to bilabial (/p/) and dental (/t/) occlusive phonemes, both the auditory areas of the temporal lobe (for understanding) and the precentral motor areas (for production) were simultaneously activated, with a difference in the locus of activation depending on the processed phoneme: at the motor representation of the lips for /p/ and of the tongue for /t/. Fadiga et al. (2002) recorded motor evoked potentials (MEPs) from the muscles of the tongue in participants who had been asked to listen to acoustic stimuli. These stimuli consisted of words or pseudowords containing a double /f/ (e.g. *baffo*, i.e. *moustache* in Italian) or a double /r/ (e.g. *birra*, i.e. *beer* in Italian), and of bitonal sounds. The /f/ is a labiodental consonant whose pronunciation does not require a particular involvement of the tongue, while the /r/ is a linguopalatal consonant whose pronunciation requires a marked involvement of the tongue. The results of the experiment showed that listening to words and pseudowords containing the double /r/ resulted in a significant increase of the MEPs, compared to bitonal sounds and to words and pseudowords containing the double /f/. As a whole, these data suggested that, in humans, there is an MNS dedicated to the comprehension of linguistic sounds (i.e. an *echo mirror system*): when an individual listens to verbal stimuli, the motor centers responsible for the emission of the phonemes present in the words heard are automatically activated. These data are highly consistent with other findings deriving from fMRI investigations. Wright et al. (2003) evaluated whether speech accompanied by both auditory and visual information (as it normally is) induced a higher activation of the STS, compared to speech associated only with mono-sensory information. In this study, the volunteers watched an actor speaking in three different conditions: audiovisual speech, auditory speech, and visual speech. The STS was strongly activated in all conditions, but, above all, and in a superadditive way, in the audiovisual condition; these results confirmed the multisensory nature of the STS.
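A response is usually called superadditive when the audiovisual response exceeds the sum of the two unisensory responses (AV > A + V). The sketch below shows one common way of operationalising this criterion; the function names and the response values are hypothetical and are not taken from Wright et al. (2003).

```python
def is_superadditive(av, a, v):
    """True when the audiovisual response exceeds the sum of the
    auditory-only and visual-only responses (AV > A + V)."""
    return av > a + v

def superadditivity_index(av, a, v):
    """Percentage by which AV exceeds (or falls short of) A + V."""
    return (av - (a + v)) / (a + v) * 100.0

# Hypothetical response magnitudes in arbitrary units (illustrative only).
auditory, visual, audiovisual = 0.5, 0.25, 1.5

superadditive = is_superadditive(audiovisual, auditory, visual)
index = superadditivity_index(audiovisual, auditory, visual)
print(superadditive, index)
```

Note that a response can be larger in the audiovisual condition than in either unisensory condition without being superadditive; the criterion is stricter than a simple AV > max(A, V) comparison.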

**Fig. 5.6** (Above) Examples of stimuli used for the study on neurons' sensory preference (i.e. the face of a conspecific emitting a vocalization vs. the opening and closing of a disc without any facial stimulus). (Below) Bioelectrical responses displayed by a multisensory cell of the associative auditory cortex of the macaque monkey. Note that the response to the combined voice and face condition (red line) is far greater than the response to uni-sensory stimulation (the response to the incongruous coupling between disc and voice, which did not stimulate the cell enough, is also drawn as a yellow line). (Adapted from Ghazanfar and Schroeder (2006). Courtesy of the authors)

A very similar, but more direct demonstration of the existence of audio-visuomotor neurons derives from a single-cell recording neurophysiological study carried out by Ghazanfar and Schroeder (2006). The authors identified neurons in the STS that not only responded to faces or voices but also exhibited a far greater responsivity to the audiovisual association, thus demonstrating their multisensory specialization (Fig. 5.6).

#### **Audio-Visuomotor Neurons and the Sound of Objects**

In a famous study by Kohler et al. (2002), published in *Science*, it was demonstrated that the brain retains specific neural representations of the actions performed on objects (e.g. beating eggs, hammering) and of the sounds typically produced by their use. In this study, the research group coordinated by Giacomo Rizzolatti discovered neurons in the PMv of the macaque monkey that 'fired' both when the animal performed a specific action and when it only heard its sound. Most neurons also fired when the monkey simply watched an action. These audiovisual MNs encoded the actions regardless of whether they were performed, listened to, or simply seen; altogether, these observations led to the discovery of the audio-visuomotor MNs. Besides the PMv cortex, hosting the audio-visuomotor MNs, there are interesting audiovisual neurons that conjointly encode objects and the sounds they produce (which, of course, are of fundamental importance for music learning and for the regulation of sensory feedback). Many neuroimaging studies have long shown the existence of multisensory audiovisual neurons in the posterior region of the STS and in the middle temporal gyrus (MTG) that respond to the sounds and visual images of objects and animals. The data showed that these regions are activated more strongly by audiovisual stimuli than by uni-sensory stimuli, thus suggesting their crucial role in the multisensory integration of inputs coming from the two modalities (see, for instance, Beauchamp et al., 2004a, b, and Tranel et al., 2003). For instance, Beauchamp et al. (2004a, b) used fMRI to explore how the brain integrates visual and auditory information related to familiar animals and objects, presenting them individually or in association with each other. Their findings clearly showed the existence of multisensory systems simultaneously encoding visual and auditory features linked to an action, such as a phonatory gesture of an animal or the manipulation of tools (see Fig. 5.7).

Because of the repeated association between an object and its typical sound, and of the fact that the brain represents the so-called *object-sound knowledge*, we can activate the image of a sound based on the sight of the object. It is for this reason that a musician can visually recognize the sound associated with a gesture or can predict the sound that will be emitted before it is played, by observing, for instance, the tension of the hair of a bow, the position of the fingers on a keyboard, or the key pressed down.

In an electrophysiological study by Proverbio et al. (2011b), it was shown that the mere sight of objects or actions associated with a sound can activate the temporal cortex, a region overseeing auditory perception. In this study, high-density ERPs were recorded in 15 students who were required to look at hundreds of images associated with a given sound or with silence (see Fig. 5.8 for some examples of stimuli). ERP analysis showed that, despite stimulation being only visual, sound-related stimuli were distinguished from non-sound-related stimuli as early as 110 ms post-stimulus. According to the authors, this happened because perception and recognition of objects, agents, and stimulus contexts triggered access to the associated auditory information. Indeed, as was well known to silent movie filmmakers, there is no need for a real auditory stimulus to activate the sensation of hearing sounds typically associated with what we are seeing: this is how in a silent movie you can almost hear the whistle of the steam train or its rattling on the tracks.

**Fig. 5.7** Visual stimulation consisted of the silent presentation of pictures of animals and tools, while auditory stimulation consisted of the blind presentation of their call or typical sound. The audiovisual stimulation involved the integration between the two modalities. Brain images show the BOLD signals of neurometabolic activation obtained by fMRI in the various stimulation conditions. Note that the audiovisual condition activated the multimodal prefrontal regions, as well as the motor and premotor cortices, the posterior region of the STS, and the MTG. (Redrawn and modified from Beauchamp et al. (2004a, b). Courtesy of the authors)

#### *Audio-Visuomotor Neurons in the Coding of Musical Actions and Sounds*

While investigating how professional pianists could identify the musical piece performed in silent scenes by looking at the movements of the musicians' hands on the keys (i.e. looking at actions performed on objects), Hasegawa et al. (2004) hypothesized that the visuomotor representation of musical gestures becomes strictly associated with their auditory representation following specific learning. In this study, seven participants without any musical experience (control group), ten participants with some experience of the piano (not very experienced), and nine professional pianists were tested. During fMRI scanning, the participants observed silent videos showing bimanual movements of a pianist pressing the keys of a piano keyboard (Fig. 5.9a: Right), or, in a baseline condition, only random key touches made while sliding across the keyboard (Fig. 5.9a: Left). Key-press movements could be completely random, that is, not at all combined with a musical piece, or related to the execution of a more or less famous piece. Professional pianists were able to identify these pieces, but, above all, the view of the musical performance, regardless of the piece, activated their fronto-parietal MNS (i.e. motor simulation) and STS, thus demonstrating that seeing familiar musical gestures activates the stored memory of the associated sounds, but only in those who actually know how to perform them. This study clearly demonstrated the role of audio-visuomotor neurons in musical learning (Paraskevopoulos et al., 2012; Schulz et al., 2003).

**Fig. 5.8** Some examples of 'sound' (top) and 'silent' (centre) visual stimuli presented together with hundreds of other stimuli to unaware observers, instructed to detect and respond to infrequent images of cycling races. The analysis of ERP peaks, together with the reconstruction of their intracerebral generators by means of the swLORETA technique, demonstrated the activation of the left medial temporal cortex after only 110 ms from the presentation of the image. The extraction of sound information associated with the use of familiar tools after ~200 ms activated the primary (BA38) and secondary (BA41) auditory cortices. This information is responsible, for example, for auditory hallucinations, which, in this case, refer, in a dim way, to the recall of the specific sound produced by the tool (in the figure, the sounds produced by the sax or by the infernal chainsaw). (Taken from Proverbio et al. (2011b). Courtesy of the authors)

**Fig. 5.9** (**a**) Examples of visual stimuli used in the study by Hasegawa et al. (2004). (**b**) Activation of the left temporal region as a function of musical performance in the three groups of participants. (**c**) fMRI activations in response to an exclusively visual stimulation in the brain of professional pianists. (Courtesy of the authors)

A similarly interesting study on audio-visuomotor coding is the one carried out by Lahav et al. (2007). In this study, naïve participants (i.e. non-musicians) were trained to play a short musical sequence by ear. Their cerebral activity was then tested by means of fMRI while they listened to the newly learned piece. The authors found that, despite the participants not making any kind of movement while listening, both motor and mirror regions were activated, including the bilateral fronto-parietal motor circuit, along with the IFG and the PMv, the IPS and the IPG. Moreover, the presentation of the same musical notes organized in a different order activated the same regions to a much lesser extent, whereas listening to a familiar musical sequence whose motor program was unknown did not activate these regions at all. These data supported the hypothesis of the existence of a "*hearing-doing*" (or "*hearing-action*") system, strongly dependent on the individual's motor repertoire. In this regard, in a study combining Transcranial Magnetic Stimulation (TMS) and MEP recordings, Candidi et al. (2014) showed that, in expert pianists, the observation of a piano fingering error (a visual gesture shown without any audio) induced a significant motor effect, and in particular a somatotopic corticospinal facilitation concerning the finger of the hand engaged in the fingering error. Together, the studies described above demonstrated how the learning of skilled gestures characterized by complex timing and applied to a given musical instrument (or to a vocal performance) occurs through the progressive and long-term association between motor, somatosensory, and auditory functional patterns, namely through a substantial audio-visuomotor coding of the musical gesture, which takes many years.

A cross-sectional study by Proverbio et al. (2015b) investigated how the representation of musical sounds changed as a function of the years of study in relation to the motor gesture necessary to produce these sounds. This study considered the development of audio-visuomotor mirror systems in young students ranging from the second year of their course of study up to the master's degree and beyond. In all, 19 music students were tested: 10 violinists and 9 clarinetists. Their chronological age ranged from 14 to 24 years, while their academic practice with their instrument ranged from 2 to 18 years. These students (recruited in their instrument classes while waiting to attend a lesson) watched on a PC screen, and listened to through headphones, a total of 400 video clips of professional violinists and clarinetists who played non-melodically 200 totally new combinations of double or single notes that covered the whole pitch range. Their task was simply to indicate the possible congruence between the gesture and the sound reproduced in each video clip on the basis of their senses. Half of the time, in fact, the sounds were not congruent with the motor gestures but were mounted onto the video track in an incongruous although perfectly synchronized way. The data showed that the actual years of study at the Conservatory correlated directly with performance in the task. It was as if the more advanced students had so firmly internalized the connection between sound, gesture, and image that they automatically perceived a possible incongruity, with a percentage of error that decreased linearly as the years of practice increased. This happened thanks to the ability of multimodal neurons to create audio-visuomotor correlations that increased with the years of study and practice, regardless of the talent and age of the individual. The first effects of cerebral modification were observable after 4–6 years of intensive study and progressively continued after graduation and the master's degree. Up to three years of study, the percentage of error was close to 50%, while only after obtaining the diploma (and about 10,000 h of study) did the percentage fall below 10%, as for music teachers. This research highlighted the crucial role of exercise in shaping the brain's musical functions, regardless of musical talent.

The same stimuli of the study described above were shown to 12 professional musicians and 12 naïve university students to study the neural mechanisms of audio-visuomotor coding of the musical gesture for their own instrument and/or for an unfamiliar instrument (Proverbio et al., 2014). While the musicians watched the stimuli, they had to decide whether the note played was double or single (an easily resolvable task), not only for their own instrument but also for an unfamiliar musical instrument. Throughout the task, their EEG was recorded in continuous mode by means of 128 sensors placed all over the scalp. Averaged ERPs indicated that audiovisual incongruity generated a prominent N400 mismatch response for the musicians' own instrument only, since it appeared almost impossible for these subjects to reach robust decisions for the unfamiliar instrument. The swLORETA applied to the N400 response identified the areas mediating multimodal motor processing: the prefrontal cortex (PFC: attention, cognitive discrepancy), the superior and middle temporal gyri (STG and MTG: auditory coding of sound), the premotor cortex (PM: motor programming, simulation), the inferior frontal and parietal areas (IF and IP: mirror system), the extrastriate region for coding of body parts (EBA), the somatosensory cortex (maps of the fingers and the hand), the cerebellum (motor coordination), and the supplementary motor area (SMA), which encodes the learned motor sequences (Fig. 5.10). In conclusion, these data indicate the existence of audio-visuomotor MNs that respond to both visual and auditory incongruent information, thus suggesting that they can encode multimodal learned motor skill representations of musical gestures and sounds.

In summary, we have reviewed a wide neuroimaging and electrophysiological literature reporting the involvement of visuomotor MNs in many mental functions, including the comprehension of actions and action intentions, understanding others' emotional and mental states, action imitation and learning, processing of visuomotor aspects of speech, vocalizations and music, developing motor or musical skills, and many others. Some critical issues still challenge the concept that human MNs can be viewed as roughly corresponding to the monkey's MNs, for which we have direct neurophysiological recordings. First of all, MNs are not always observed while recording from the fronto-parietal areas of the monkey's brain, and their incidence can be very variable, ranging from 8.9% for ventral intra-parietal areas (VIP) to 60% for premotor dorsal areas (PMd). Other critical issues concern the fact that cell-recording studies are not very numerous (also for ethical reasons) and that, in humans, the evidence is relatively indirect (not based on intracranial recordings). It should also be borne in mind that MNs are only indirectly involved in social and affective processes, such as empathy, contributing only to the visuomotor recognition of body language and gestures.

**Fig. 5.10** Coronal, sagittal, and axial views of the standardized and weighted LOw REsolution electromagnetic TomogrAphy (swLORETA) applied to the N400 bioelectric response generated only for one's own musical instrument. (Taken from Proverbio et al. (2014) and redrawn)

#### **References**



## **Chapter 6 Sex Differences in Social Cognition**

**Alice Mado Proverbio**

**Abstract** Several studies have demonstrated sex differences in empathy and social abilities. This chapter reviews studies on sex differences in the brain, with particular reference to how women and men process faces and facial expressions, social interactions, pain of others, infant faces, faces in things (*pareidolia*), living vs. non-living information, purposeful actions, biological motion, erotic vs. emotional information. Sex differences in oxytocin-based attachment response and emotional memory are also discussed. Overall, the female and male brains show some neurofunctional differences in several aspects of social cognition, with particular regard to emotional coding, face processing and response to baby schema that might be interpreted in the light of evolutionary psychobiology.

**Keywords** Hemispheric asymmetries · Facial expressions · Parental response · Face pareidolia · Sex hormones

#### **Introduction**

Genetic and hormonal influences have long been known to affect the human brain and determine a variety of anatomical and functional differences between the two sexes (see Hines, 2020, for a review). This cerebral sexual dimorphism would support marked diversities in reproductive, parental, and social behavior. A rapidly increasing literature now documents significant sex differences in the reactivity to, and efficacy of, drugs and pharmaceutical molecules, as well as in the incidence of neurodegenerative, neurological, and psychiatric diseases (see the entire volume dedicated to sex differences in the brain, edited by Cahill, 2017).

Besides anatomical and physiological diversities, some functional and mental differences between men and women have been recently reported by neuroscientific studies, for example for the following abilities: verbal fluency (Sokolowski et al., 2020), emotion recognition (Connoly et al., 2019; Li et al., 2020), face perception (e.g., Zhou & Meng, 2020), and empathy, as shown by a recent survey examining the empathy quotients of 671,606 individuals (Greenberg et al., 2018).

A. M. Proverbio (\*)

Department of Psychology, University of Milano-Bicocca, Milan, Italy e-mail: mado.proverbio@unimib.it

Several studies have demonstrated sex differences in empathy and related capacities. This chapter reviews studies on sex differences in the brain, with particular reference to how women and men process faces and facial expressions, social interactions, pain of others, infant faces, faces in things (*pareidolia* phenomenon), opposite- vs. own-sex faces, living vs. non-living information, incongruent/inappropriate behavior, motor actions, biological motion, erotic vs. emotional information. Sex differences in oxytocin-based attachment response and emotional memory are also discussed. Overall, the female and male brains show some neuro-functional differences in several aspects of social cognition, with particular regard to emotional coding, face processing, and response to baby schema, which might be interpreted in the light of evolutionary psychobiology.

In this chapter, a recent and comprehensive review of neuroimaging, electrophysiological, and behavioral findings in the literature supporting the hypothesis of a sex difference in social cognition is provided and discussed, under the framework of cognitive neuroscience and evolutionary psychobiology theories.

The main sex differences in the social brain possibly refer to:


#### **Hemispheric Asymmetries for Face Processing**

While it is currently believed that face processing predominantly activates the right hemisphere in humans, some data reveal a lesser degree of lateralization of brain functions related to face coding in women than in men. For example, a left hemispheric involvement of the occipito-temporal cortex in women for the processing of human faces has been demonstrated in two independent studies, showing a bilateral pattern of activity of the fusiform face area (FFA) indexed by the N170 ERP response in females, as opposed to the typical male right-sided hemispheric asymmetry (Proverbio et al., 2006b, 2012). In more detail, Proverbio and co-workers (2012) recorded ERPs in 50 right-handed women and men in response to 390 faces of male and female infants, children, or adults, and technological objects, in a landscape detection task. Results showed no sex difference in the amplitude of N170 to objects, a much larger face-specific response over the right hemisphere in men, and a bilateral response in women (see Fig. 6.1). Furthermore, a lack of the face-age coding effect was found over the left hemisphere in men, with no differences in N170 to faces as a function of age. Conversely, N170 was sensitive to face age (e.g., differentiating children from adults) over both hemispheres in women.

Overall, these findings are in line with many studies that show differences between men and women in the degree of lateralization of cognitive and affective processes. Substantial data support greater hemispheric lateralization in men than women for linguistic tasks and for spatial tasks. Sex differences have also been found in the lateralization of visual-spatial processes such as object construction and mental rotation tasks, in which males are typically right hemisphere dominant and females bilaterally distributed. Consistent with this pattern of results are the data provided by Bourne (2005), who examined the lateralization of processing positive facial emotion in a group of 276 right-handed individuals. Subjects were asked to observe a series of chimeric faces with contrasting expressions and to decide which face they thought looked happier. The results showed that men were more strongly lateralized than women, showing a greater perceptual asymmetry in favor of the left visual field (RH). A similar pattern of results has been reported by Tiedt et al. (2013). The inter-hemispheric transfer time of face-related inputs also seems to differ between the sexes: the N170 recorded in men has faster latencies in the left visual field (LVF)/RH → LH (170 ms) direction than in the right visual field (RVF)/LH → RH (185 ms) direction, while it is symmetric in women (Proverbio et al., 2012). Figure 6.2 shows larger delays in N1 latency (due to callosal transfer) relative to the ipsilateral stimulation, for stimuli presented to the RVF (left hemisphere), in men.

In men, N170 was significantly earlier (*p < 0.0007*) for ipsilateral (crossed) responses over the left than right hemisphere. This effect was not found in women, who showed an interhemispheric transfer time (IHTT) of equal latency in the two directions. As for contralateral (uncrossed) responses, N170 was earlier over the RVF/LH than LVF/RH in women and of equal latency for both hemispheres in men. One potential explanation of the findings is that IHTT would be more rapid and symmetric in women than men. Notwithstanding the large electrophysiological literature in favor of this hypothesis, in the neuroimaging domain, the standard and to-be-expected pattern of lateralization for face processing is still considered to be the right-sided activation of the fusiform gyrus and of the right occipital face area for both sexes (e.g., Jacques et al., 2019).
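As an illustration of the crossed vs. uncrossed logic described above, the following minimal sketch estimates IHTT as the difference between the mean N170 latency of the ipsilateral (crossed) response and that of the contralateral (uncrossed) response, separately for each transfer direction. The function name `ihtt_ms` and all latency values are hypothetical and chosen only for illustration; they are not data from Proverbio et al. (2012), and the original analysis may have been computed differently.

```python
from statistics import mean


def ihtt_ms(crossed_latencies_ms, uncrossed_latencies_ms):
    """Estimate IHTT as mean crossed (ipsilateral) latency minus
    mean uncrossed (contralateral) latency, in milliseconds."""
    return mean(crossed_latencies_ms) - mean(uncrossed_latencies_ms)


# Hypothetical per-participant N170 peak latencies (ms); NOT study data.
# LVF stimulus: the right hemisphere responds directly (uncrossed),
# the left hemisphere responds after callosal transfer (crossed, RH -> LH).
lvf_left_hemisphere = [168.0, 171.0, 171.0]
lvf_right_hemisphere = [160.0, 163.0, 163.0]
# RVF stimulus: the left hemisphere responds directly (uncrossed),
# the right hemisphere responds after callosal transfer (crossed, LH -> RH).
rvf_right_hemisphere = [184.0, 186.0, 185.0]
rvf_left_hemisphere = [165.0, 167.0, 166.0]

rh_to_lh = ihtt_ms(lvf_left_hemisphere, lvf_right_hemisphere)
lh_to_rh = ihtt_ms(rvf_right_hemisphere, rvf_left_hemisphere)

# A positive value means transfer is slower in the LH -> RH direction.
asymmetry = lh_to_rh - rh_to_lh

print(f"RH -> LH: {rh_to_lh:.1f} ms, LH -> RH: {lh_to_rh:.1f} ms, "
      f"asymmetry: {asymmetry:.1f} ms")
```

Under this convention, a symmetric pattern such as the one reported for women would correspond to an asymmetry value close to zero.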

**Fig. 6.1** Isocolor voltage topographical maps (left- and right-side views) showing N170 scalp distribution in female and male observers. N170 response is relative to adult face processing. The time window corresponds to its peak (150–170 ms) of maximum activation. (Taken from Proverbio et al. (2012), with permission from the authors and the editor)

#### **Affective Facial Expressions and Emotions**

Several studies have provided evidence of women's greater accuracy in interpreting emotional states and mind reading (Babchuk et al., 1985; Wingenbach et al., 2018). One potential explanation of these findings is that the primary role of female humans (and primates in general) in breastfeeding and rearing young offspring would have improved their ability to interact with them affectively and to understand their non-verbal behavior.

In this regard, Proverbio et al. (2007) examined the roles of sex and expertise in interpreting infant expression in a group of 34 men and women differing in their experience with infants (Fig. 6.3). The participants were subdivided into two groups (experts or non-experts) on the basis of their specific familiarity with infant facial expressions. In detail, individuals considered "non-expert" were those without children, nieces or nephews, and without a specific familiarity/skill with neonates or pre-school age children acquired through professional activities. In contrast, individuals with natural or adopted children, nieces or nephews under the age of 5 years old, as well as nursery school teachers or infant school teachers were considered "experts." Women showed a significantly higher level of decoding accuracy compared to men; furthermore, expertise positively affected the decoding of facial expressions among women only. These results suggest that in judging emotional facial expressions of infants, there is an interaction of biological (i.e., sex) and cultural factors (such as familiarity with infantile mimicry).

**Fig. 6.2** N170 latency values (along with SD) recorded in women and men in response to lateralized faces, as a function of cerebral hemisphere and stimulus contra-laterality (collapsed across occipito/temporal electrode sites). In this study, ERPs were recorded in strictly right-handed people (16 men and 17 women) engaged in a face-sex categorization task. Occipital P1 and occipito/temporal N170 were left lateralized in women and bilateral in men. N170 to contralateral stimuli was larger over the RH in men and the LH in women. Inter-hemispheric transfer time (IHTT) was approximately 4 ms at the P1 level and approximately 8 ms at the N170 level. It was asymmetric in men, with faster latencies in the left visual field (LVF)/RH → LH (170 ms) direction than in the right visual field (RVF)/LH → RH (185 ms) direction and symmetric in women. These findings suggest that the asymmetry in callosal transfer times might be due to faster transmission times of face-related information via fibers departing from the more efficient to the less efficient hemisphere (Proverbio et al., 2012)

**Fig. 6.3** Examples of photographs used as stimuli, as a function of facial expressions (Proverbio et al., 2007). The upper row shows positive emotional states with strongly positive emotions, such as joy on the left, and mildly positive ones, such as comfort or peacefulness, on the right. The lower row shows negative emotional states with the mildly negative emotions, such as discomfort or disappointment, on the left and strongly negative ones, such as displeasure or pain, on the right

In an electrophysiological study performed on the same set of stimuli (Proverbio et al., 2006b), it was investigated whether viewers' sex affected the visual cortical response at various stages of perceptual processing during a judgment task of infant happy/distressed expression. All infants were unfamiliar to viewers. The lateral occipital P110 response was much larger and occurred earlier in women than in men, regardless of facial expression, thus indicating a sex difference in early visual processing. Furthermore, P110 latency was earlier in response to distressed than neutral children in women only, thus possibly showing a prioritized processing of biologically relevant information in the female brain (Fig. 6.4).

The role of viewer sex in the emotional evaluation of, and psychological reactivity to, human faces of various ages, sexes, and typologies has been extensively explored. Table 6.1 shows some of the main gender differences in facial expression processing.

**Fig. 6.4** Mean latency (in ms) of the P1 component (along with SD) recorded at the lateral occipital area (independent of hemispheric site) and analyzed according to subjects' sex and type of facial expression. (Taken and modified from Proverbio et al., 2006b study, with permission of the authors and the editor)

In general, perception of aversive faces would activate an amygdala-based arousal response able to affect general stimulus processing (Phelps & LeDoux, 2005). Furthermore, stimuli inducing greater arousal in the percipient would be subjected to a prioritized processing because of their biological relevance; for example, infant faces would trigger an instinctive parental response. In this respect, it has been reported that erotic stimuli are particularly arousing for men as compared to women. Sabatinelli et al. (2004) provided fMRI evidence that perception of erotic pictures is associated with a much larger activation of the extra-striate visual cortex in men vs. women, while Huynh et al. (2012) showed the opposite effect in women, with high-intensity erotic visual stimuli de-activating the primary visual cortex as compared to low-intensity erotic movies and neutral movies. Conversely, perception of body mutilations (stimulating the empathic circuits) would be associated with a stronger activation of the extra-striate visual cortex in women vs. men. These findings have been interpreted by the authors of the studies according to the hypothesis that the degree of cerebral arousal and mobilization of attentional resources devoted to stimulus processing would depend on its biological relevance for the observers. It is well known that women respond differently than men to erotic information. Some women feel repulsed by muscular, erotic male photos. In general, while men are more sexually aroused by visual stimuli, women seem to be more sexually aroused by auditory, tactile, or emotionally relevant information (see Chung et al., 2013).

In a recent study (Proverbio, 2017), 15 male and female university students evaluated 400 human faces of various ages and sexes according to the parameters of arousal and valence. The same face set was preliminarily validated (and sex-matched) by a group of 20 independent judges (10 men and 10 women) who were asked to evaluate the degree of trust inspired by each face by means of a 3-point Likert scale. The aim was to explore the possible interaction of facial characteristics with judges' sex and age. Participants shared their ethnicity (which was Caucasian) with that of the observed faces (therefore, ethnicity or "race" was not a factor in this study, nor was the so-called "*other-race effect*" (ORE; Caldara et al., 2004; Proverbio et al., 2011a)).

Overall, the data collected in this study (Proverbio, 2017), relative to heterosexual young adults, showed a sex difference in the evaluation of human faces along the arousal and valence dimensions. Specifically, an opposite-sex preference (with higher valence ratings) was found only in men, in favor of female adolescents (but not mature women), thus strongly interacting with face age. There was only a tendency for women participants to prefer male faces, possibly because of a lack of specific aesthetic value (faces were selected as normotypical) and the presence of negative facial expressions (such as hate, hostility, disgust) making some faces not particularly attractive. Female subjects showed a preference for the faces of children and the elderly (as compared to other age ranges) in the arousal evaluation. The female appreciation of elderly faces might be interpreted in the light of a greater empathetic attitude for fragile persons, whereas the female preference for children's faces would rely on specific neural mechanisms sensitive to child-like cues in face stimuli. Overall, women rated all human faces as more arousing and more positive than men, possibly indicating a preference, or greater interest, for faces, facial expressions, and social information in general (Proverbio et al., 2008). This piece of evidence fits with the Baron-Cohen model of sexual dimorphism in empathy and facial expression coding ability (Baron-Cohen et al., 2001; Baron-Cohen and Wheelwright, 2004). In the light of this framework, it can be proposed that the higher female ratings of valence and arousal found in the present study might reflect a greater attentional allocation to (or interest for) human faces as sensory signals (Pavlova et al., 2014, 2015).

While it seems that females are generally significantly faster and more accurate at emotion recognition, some studies failed to show consistent gender differences while varying experimental conditions (Klein & Hodges, 2001).

#### **Parental Response**

The viewers' age and the possible interaction with face age are also explored in the literature on the so-called *baby schema* effect, which predicts a preference for, and a perceptual advantage of infant vs. adult faces (Brosch et al., 2007; Glocker et al., 2009a; Luo et al., 2011; Proverbio et al., 2011a, b), as nicely reported in a review by Hahn and Perrett (2014).

The literature shows that the adult visual and orbitofrontal cortices are specifically activated and aroused by the view of infants, also providing a pleasant sensation through the dopaminergic reward circuitry. This would happen to a greater extent in women than men, according to some authors (Hahn et al., 2013; Nitschke et al., 2004; Parsons et al., 2011, 2013). Indeed, behavioral studies showed how women might be more responsive to baby schema than men and better able to decode infant expressivity (Proverbio et al., 2007; Babchuk et al., 1985). In an electrophysiological study (Proverbio et al., 2006a, b) aimed at investigating the neural response to baby schema in female and male adult individuals, ERP results revealed a larger sensory P100 response to faces in women than in men (irrespective of whether they were parents themselves or nulliparous). These findings may possibly be interpreted as a sign of greater perceptual sensitivity (or increased arousal response) in women than men at the view of unrelated infants. Similar studies have shown that infant faces hold greater incentive salience for women than they do for men (Hahn et al., 2013; Parsons et al., 2011, 2013). Again, infant faces have been shown to capture women's attention to a greater extent than adult faces, whereas infant faces capture men's attention more so than same-sex faces, but much less than opposite-sex faces (Cárdenas et al., 2013).

Several recent neuroimaging studies (Glocker et al., 2009b; Kringelbach et al., 2008; Leibenluft et al., 2004) have investigated the neural circuits subtending the so-called "parental response" to infants and identified a set of structures predominantly involving the orbito/frontal cortex devoted to social cognition and belonging to the dopaminergic reward system. The neural correlates of "maternal love" have been investigated by recording the brain activation of mothers viewing pictures of their own children. The results showed activation of brain areas linked to affect (amygdala) and in particular positive emotions (orbitofrontal cortex and connected regions belonging to the pleasure/reward circuitry such as the periaqueductal gray matter). The possible role of oxytocin in maternal love has also been examined in an electrophysiological study (Peltola et al., 2014) testing the associations of motherhood and oxytocin receptor genetic variation with neural and behavioral responses to emotional expressions of infants and adults. It was found that mothers (vs. nonmothers) and individuals carrying the rs53576 GG variant of the *OXTR* gene (vs. A-carriers) showed enhanced ERP differentiation of infants' strong versus mild intensity facial expressions (i.e., pleasure and distress vs. comfort and discomfort).

Overall, the parental role (having one's own children) has been associated with a greater sensitivity to infant facial expression. In an electrophysiological study performed in parent vs. nulliparous adults, it was shown that the perceptual N160 response reflected the earliest discrimination of mild vs. strong painful facial expressions in parents (especially in mothers) but not in nulliparous individuals. These findings possibly suggest a strong interactive influence of genetic predisposition and parental status on the responsivity of visual brain areas (Proverbio et al., 2006a). Again, the data showed larger P3 responses in mothers versus all other groups (including fathers and nulliparous women), possibly indicating a greater perceptual sensitivity (or increased arousal response) in mothers, at the view of unrelated infants (Fig. 6.5).

As for the auditory modality, other studies have demonstrated an enhanced response to infant vocalizations (cry and laughter) in females vs. males (Sander et al., 2007; Seifritz et al., 2003), supporting the hypothesis of a sex difference in the parental response to infantile communicative signals.

#### **Interest in Social Stimuli**

In Proverbio's (2017) previously described study, regardless of faces' sex, women's ratings were significantly higher for both arousal and valence dimensions, thus suggesting that women might be more interested in or aroused by the specific sensory stimulus (the human face). These data fit with some electrophysiological literature providing evidence of a greater female electro-cortical responsivity to faces and people than to inanimate scenarios such as landscapes.

In a study by Proverbio et al. (2008), 24 men and women viewed 220 images portraying persons or landscapes (see Fig. 6.6 for some examples of stimuli) and ERPs were recorded from 128 sites. In women, but not in men, the N2 component (210–270 ms of latency) was much larger to persons than to scenes. Inverse solution (swLORETA) showed significant bilateral activation of face-devoted areas (namely, the fusiform gyrus, BA19/37) in both sexes when viewing persons as opposed to scenes. However, only women showed a source of activity in the superior temporal gyrus (STG) and in the right middle occipital gyrus (MOG), extrastriate body area (EBA), and only men in the left parahippocampal area (PPA). This was interpreted as an index of a greater female interest in, or attention to, this class of biologically relevant signals (human faces and bodies).

**Fig. 6.5** ERP signals recorded over left and right lateral occipital sites following presentations of infant facial expressions exhibiting strongly negative emotions, according to viewer group. Smaller P300 amplitudes were recorded in fathers vs. mothers, especially with infant expressions of suffering. (Taken from Proverbio et al., 2006a, with authors' and editors' permission)

**Fig. 6.6** Examples of social and nonsocial stimuli used to evaluate the interest in social information, regardless of stimulus color richness and perceptual complexity. (Taken from Proverbio et al. (2008)'s study)

Whatever the cause, little neuroscientific evidence of such preference of the female brain for social stimuli has been reported, in contrast to the large body of behavioral evidence showing that females have greater social and affective competence. For example, substantial literature has accumulated indicating that women are better than men at decoding facial expressions of emotion (Thomson & Voyer, 2014). Various studies have demonstrated differences between the ways in which men and women perceive (Proverbio et al., 2016), process (Canessa et al., 2012), express (McDuff et al., 2017), and experience emotions (Proverbio et al., 2009). Research generally suggests that women are more able, as well as more inclined, to express their own emotions to conspecifics (McDuff et al., 2017). Furthermore, they show greater ease in decoding non-verbal indicators connected to the expression of emotions. It has been reported that female children across various human cultures are prone to spend more time with their younger siblings, or their simulacra (baby dolls), than are their male counterparts. It is quite difficult to determine whether this socially oriented behavior is entirely due to cultural factors (such as the style of upbringing) or to a biological difference dependent on genetic factors. Since the task in the study by Proverbio et al. (2008), which showed a greater interest in social stimuli, required no behavioral response or attention allocation to social information (it consisted of detecting rare Mondrian pictures), the stronger responsivity to persons than landscapes in women would reflect a privileged processing of images depicting conspecifics in the female brain.

Consistent with this hypothesis, numerous studies (e.g., Wingenbach et al., 2018) have demonstrated that women have a greater ability than men to decipher emotions through facial expressions or other nonverbal communication and are more inclined and more competent in expressing their emotional experiences to others (Dimberg & Lundquist, 1990). Further evidence has demonstrated that women, as compared to men, react more strongly when viewing affective stimuli (such as IAPS) involving human beings, thus showing higher empathic responses (Proverbio et al., 2009). In this regard, some authors have established a link between sex, social skills, and action processing because of the strong association between the known action observation/execution properties of the motor mirror system and the theorized social functions of the human mirror system (Oberman et al., 2007).

#### **Action and Body Language Understanding**

Several sex differences have been reported in action understanding tasks. Female participants have been found to be better than men at understanding the purpose of an action, as indexed by earlier and larger discriminative ERP responses to incongruent and purposeless behavior (Proverbio et al., 2010c). Perception of plausible and understandable actions (e.g., smiling couple clinking glasses of champagne) was contrasted with that of implausible and unintelligible actions (e.g., businesswoman balancing on one foot in the desert). ERP data showed early processing of the action's purpose in the female brain, with a larger parietal N200 to understandable behavior. Source reconstruction (swLORETA) located the neural generators of this effect in the inferior/parietal, left inferior/frontal, left and right premotor areas, right cingulate cortex, right superior/temporal and extra-striate cortex belonging to the so-called "human mirror-neuron system (MNS)." The anterior N400 discriminative response (implausible–plausible) was greater in women than men (see Fig. 6.7). The data suggest that congruent/incongruent actions are processed differently by the two sexes, with a prevalence of limbic and cingulate activation in women, and of orbito/frontal activation in men, along with a right STG activation of comparable amplitude in men and women.

Consistently, the combined fMRI and ERP study by Canessa et al. (2012) and Proverbio et al. (2011c) found differences across male and female participants involving a stronger activation of the action understanding system, the STS, and the ventral premotor cortex (associated with the mirror resonance of others' actions) during the observation of cooperative (vs. affective) scenes in women. Again, other studies provided evidence of sex differences in the development of brain mechanisms for processing biological motion (Anderson et al., 2013). In an fMRI study involving the visual perception of point-light displays of coherent and scrambled biological motion, enhanced activity during coherent biological motion perception was found in females relative to males in a network of brain regions possibly implicated in social perception, including amygdala, medial temporal gyrus, and temporal pole (Anderson et al., 2013). All in all, these pieces of evidence indicate a female superiority in social skills and sex differences in action/behavior processing.

**Fig. 6.7** ERP difference waves obtained by subtracting ERPs to congruent from ERPs to incongruent actions separately for men and women, over anterior scalp sites. A much larger N400 response occurred to incongruent actions in women than men. (Taken and modified from Proverbio et al., 2010c)
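The difference-wave computation described in the caption of Fig. 6.7 can be sketched as a sample-by-sample subtraction of the congruent ERP from the incongruent ERP, followed by averaging over a time window. The sketch below is only illustrative: the function names, the 350–450 ms window, and all amplitude values are assumptions made for the example and are not taken from Proverbio et al. (2010c).

```python
def difference_wave(incongruent_uv, congruent_uv):
    """Subtract the congruent ERP from the incongruent ERP, sample by sample."""
    if len(incongruent_uv) != len(congruent_uv):
        raise ValueError("Both ERPs must have the same number of samples")
    return [inc - con for inc, con in zip(incongruent_uv, congruent_uv)]


def mean_amplitude(wave_uv, times_ms, start_ms, end_ms):
    """Average the wave over the samples falling inside [start_ms, end_ms]."""
    window = [v for v, t in zip(wave_uv, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("No samples fall inside the requested window")
    return sum(window) / len(window)


# Hypothetical grand-average amplitudes (microvolts) at five time points;
# these numbers are invented for illustration and are NOT study data.
times_ms = [300, 350, 400, 450, 500]
incongruent_uv = [-1.0, -3.0, -4.0, -3.0, -1.0]
congruent_uv = [-0.5, -1.0, -1.5, -1.0, -0.5]

n400_difference = difference_wave(incongruent_uv, congruent_uv)

# The 350-450 ms window is an assumed N400 range, chosen only for the example.
n400_effect_uv = mean_amplitude(n400_difference, times_ms, 350, 450)

print(n400_difference)
print(round(n400_effect_uv, 3))
```

A more negative mean amplitude in the difference wave indicates a larger N400 response to incongruent actions; computing it separately for each group would allow the kind of comparison between women and men shown in the figure.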

#### **Face Pareidolia**

Recent behavioral and electrophysiological research has shown that women are better at seeing faces, even when there are none, a perceptual phenomenon called "face pareidolia" (i.e., the illusory perception of non-existent faces). Sometimes, while observing the clouds in the sky, coffee foam, or random decorative patterns, we might be struck by the impression of clearly perceiving a face that is so well defined and yet so illusory. This perceptual effect has precise neural underpinnings based on the fusiform face area.

Pavlova et al. (2015) carried out a spontaneous recognition task in which adult females and males were presented with a set of food plate images resembling faces (Arcimboldo style). Women not only recognized the images as faces more readily (they reported that images resembled a face when males still did not) but also gave more face responses overall. Proverbio et al. (2016) investigated the neural correlates of this sex difference in a study in which ERPs were recorded while participants viewed pictures of animals intermixed with those of familiar objects, faces, and faces-in-things. Overall, compared to the men, the women were significantly more inclined to perceive faces in perfectly real object photographs, as shown in the preliminary face-likeness ratings assessment.

Furthermore, the face-specific Vertex Positive Potential (VPP, 150–190 ms) showed a difference in the processing of faces-in-things between males and females at frontal sites; while for men VPP was of intermediate amplitude between faces and objects, for women there was no difference in VPP response to faces or faces-in-things, thus suggesting a marked anthropomorphization of objects in the latter group (Fig. 6.8). SwLORETA source reconstruction showed how in the female brain, face *pareidolia* was associated with the activation of brain areas involved in the affective processing of faces (right STS, BA22; posterior cingulate cortex, BA22; orbitofrontal cortex, BA10), which was not found in men. Normally the visual cortex separates face processing from object processing so that faces are automatically processed in ways that are inapplicable to objects (e.g., gaze detection, gender detection, and facial expression coding). However, the present data showed sexual dimorphism, with this dichotomy being much stricter in men than women because of an anthropomorphizing bias in the female brain.

#### **Empathy for Pain**

**Fig. 6.8** (Top) ERP waveforms recorded in women and men as a function of stimulus type. VPP was much larger to faces and faces-in-things than objects in women. (Bottom) Mean amplitude of the N170 response recorded as a function of stimulus type and relative scalp distribution

Recent findings have demonstrated that women might be more responsive than men to the sight of painful stimuli (triggering a vicarious response to pain), and therefore more empathic (Han et al., 2008). We investigated whether the two sexes differed in their cerebral responses to affective pictures portraying humans in different positive or negative contexts compared to natural or urban scenarios (Proverbio et al., 2009). Four hundred and forty IAPS slides were presented to 24 Italian students (12 women and 12 men). An emotional impact scale was administered to all participants prior to EEG recording, showing higher emotional psychological reactions in women than men to a variety of emotional stimuli (both animated and unanimated ones), as shown in Fig. 6.9.

Occipital P115 response of ERPs was greater in response to persons than to scenes and was affected by the emotional valence of the human pictures. A possible explanation for this piece of evidence is that the processing of biologically relevant stimuli was prioritized in both sexes. A late positivity to suffering humans (visible in Fig. 6.10, blue line) far exceeded the response to negative scenes in women but not in men. Increased right amygdala and right frontal area activities were observed only in women. These data possibly indicate a sex-related difference in the brain response to humans, possibly supporting human empathy.

**Fig. 6.9** Data obtained from the emotional impact scale (self-reporting questionnaire) administered to the 24 persons participating in the ERP experiment, separately for each image type, and according to their sex. Key: 0 = not at all; 1 = a little; 2 = fairly; 3 = very much; 4 = extremely

Previous studies have demonstrated that females show greater responsiveness in various brain areas to generically negative pictures, but to date, none has investigated the specific role of the presence of humans in determining the brain emotional response in both sexes. For example, Hofer et al. (2007) found larger activation of the right superior temporal area, right insula, right putamen, and anterior cingulate cortices during the processing of positively valenced words versus non-words for women versus men and interpreted these data in terms of the greater emotionality of the female sex. On the other hand, Klein et al. (2003) found increased activation of the amygdala and ACC in women in response to negative IAPS images. In our study, sex differences as a function of the affective valence of pictures were much greater for humans than scenes, thus indicating the special status of the visual image of humans for the female brain, especially in interaction with affective information. Our data are consistent with the more recent literature suggesting that women are more empathic than men are when viewing suffering humans (Han et al., 2008; Schulte-Rüther et al., 2008; Singer et al., 2004).

#### **Sexual Hormones and Oxytocin**

**Fig. 6.10** ERPs recorded at right parietal sites as a function of stimulus content and valence and viewer's sex. A large effect of the emotional content of the stimulus is visible (evidenced by comparing ERPs to negative vs. positive unanimated scenes), together with an effect of empathy for pain, especially in women (evidenced by comparing ERPs to negative scenes vs. ERPs to pictures portraying humans)

The literature has shown that social processes, and in particular the neural response to opposite-sex faces, may vary as a function of women's hormonal phase. Furthermore, oral contraceptive pill use can affect cognition and alter resting state functional connectivity. Indeed, women using oral contraceptives have been shown to differ from non-pill users in memory, mental rotation, and affective memory tasks (Nielsen et al., 2011, 2014). In conclusion, hormonal control, or the lack of it, represents an important variable in determining the neurofunctional behavior of the female brain, and it should be monitored in studies on sex differences.

Several authors (Alexander & Hines, 2002) pointed out the genetic/biological nature of female preference for social stimuli. For example, evidence of toy preference in nonhuman primates (*Cercopithecus aethiops sabaeus*) has been provided, with male vervets preferring to play with unanimated fast-moving toys (e.g., cars or balls) and female vervets preferring contact with dolls. These data suggest that sexually differentiated interest for infants/dolls arose early in human evolution, prior to the emergence of a distinct hominid lineage. Comparative studies are quite relevant in this regard since monkeys are not subject to the cultural influences proposed to explain human sex differences in social cognition.

Furthermore, other findings support the hypothesis of biological, predetermined sex differences in social interest, not dependent on cultural conditioning, but linked to the genetic role of women as primary offspring caregivers. One of the most important pieces of evidence is the observation of an early interest for infants traceable in all human cultures and historical periods in young females. Remarkably, the same phenomenon has been observed in monkeys (juvenile baboons, macaques, and rhesus monkeys: Herman et al., 2003; Maestripieri and Roney, 2006), as reflected by a higher rate of interaction with infants in females than males. The interaction includes behaviors like embracing, holding, carrying, playing, grooming, touching, and staying close to infants, and it is unaffected by hormone manipulations. According to Maestripieri and Pelka (2002), sex differences in interest in infants across the lifespan should be interpreted as a biological adaptation for parenting. Neuro-hormonal studies carried out in humans have shown that the early interest for infants may be modulated by hormonal factors. For example, Leveroni and Berenbaum (1998) reported that girls precociously exposed to high levels of androgens (because of congenital adrenal hyperplasia) displayed less interest in infants than their normal sisters. Consistently, it has been shown in primates that maternal hormonal changes influence social interaction with unrelated infants (Ramirez et al., 2004), making adult females more empathic and receptive. In this regard, oxytocin has been shown to affect the empathic attitude in humans, by increasing social trust, and even improving the ability to infer affective mental states of others (Domes et al., 2007).

#### **Conclusion**

On the basis of a review of the relevant literature, it is concluded that many of the sex differences in social cognition may be related to the (biologically determined) role of females as primary offspring caregivers (as opposed to fighters/hunters, e.g., Kuhn and Stiner, 2006). This distinction may be associated with females' greater empathic attitude, ability to understand body language and facial expressions, attachment and responsivity to infants (oxytocin-mediated), early interest in infants, interest in social information, emotional responsivity, and lower incidence of autistic, psychopathic, and sociopathic disorders. In this way, this chapter provides a unified framework for understanding the multifaceted consequences of a sexual dimorphism in human parental behavior.

**Acknowledgements** We wish to thank Roberta Adorni, Valentina Brignone, Valeria De Gabriele, Marzia Del Zotto, Jessica Galli, Valentina Lozano, Eleonora Martin, Silvia Matarazzo, Roberta Mazzara, Mirella Manfredi, Laura Paganelli, Federica Riva, Laura Trestianu, and Alberto Zani for their kind contributions.

Supported by 13974 2015-ATE-0052 grant entitled "Emotional responses and gender differences in individuals with high traits of psychopathy, impulsivity and empathy" from University of Milano–Bicocca.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 7 Development of Morality and Emotional Processing**

**Lucas Murrins Marques, Patrícia Cabral, William Edgar Comfort, and Paulo Sérgio Boggio**

L. M. Marques (\*)

Instituto de Medicina Fisica e Reabilitacao, Hospital das Clinicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, Brazil

P. Cabral · W. E. Comfort · P. S. Boggio

Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, Sao Paulo, Brazil

**Abstract** Emotions play a very important role in moral judgments. Hume argues that morality is determined by feelings that make us define whether an attitude is virtuous or criminal. This implies that an individual relies on their past experience to make a moral judgment, so that when the mind contemplates what it knows, it may trigger emotions such as disgust, contempt, affection, admiration, anger, shame, and guilt (Hume D. An enquiry concerning the principles of morals, 1777 ed. Sec. VI, Part I, para, 196, 1777). Thus, even so-called "basic" emotions can be considered as moral emotions. As Haidt (The moral emotions. In: Handbook of affective sciences, vol 11, 852–870, Oxford University Press, 2003) points out, all emotional processing that leads to the establishment and maintenance of the integrity of human social structures can be considered as moral emotion. Consequently, the construct of "morality" is often characterized by a summation of both emotion and cognitive elaboration (Haidt J. Psychol Rev, 108(4):814, 2001).

**Keywords** Emotional processing · Morality · Moral psychology

#### **Introduction**

Emotions play a very important role in moral judgments. Hume argues that morality is determined by feelings that make us define whether an attitude is virtuous or criminal. This implies that an individual relies on their past experience to make a moral judgment, so that when the mind contemplates what it knows, it may trigger emotions such as disgust, contempt, affection, admiration, anger, shame, and guilt (Hume, 1777). Thus, even so-called "basic" emotions can be considered as moral emotions. As Haidt (2003) points out, all emotional processing that leads to the establishment and maintenance of the integrity of human social structures can be considered as moral emotion. Consequently, the construct of "morality" is often characterized by a summation of both emotion and cognitive elaboration (Haidt, 2001).

According to the Social Intuitionist Model (Haidt, 2001), moral judgment is substantially influenced by "intuitions," i.e., automatic affective reactions. In turn, these intuitions appear to have evolved from physiological reactions to external threats and opportunities over our phylogenetic history (Bloom, 2012) and now play a role in resolving situations that threaten the integrity of human social structures (Haidt, 2003). A later hypothesis, the Moral Foundations Theory (MFT; Haidt & Joseph, 2004), based on the assumptions of the Social Intuitionist Model, posits that these intuitions emerge whenever at least one of six universal human "moral foundations" is violated: (i) Care; (ii) Fairness; (iii) Loyalty; (iv) Authority; (v) Sanctity; and (added by Haidt, 2012) (vi) Liberty.

In summary, the violations of these six foundations can be described and exemplified as follows (Graham et al., 2011, 2013): (i) Care/harm: situations that involve a failure of emotional or physical care between humans, or by humans toward animals (e.g., physical aggression in response to an affective betrayal); (ii) Fairness/cheating: situations involving cheating (e.g., the use of public money for personal purposes); (iii) Loyalty/betrayal: situations in which an individual shows disloyalty toward a person or entity (e.g., an employee who works simultaneously for a competing company); (iv) Authority/subversion: situations involving disrespect and disregard for a figure of authority (e.g., talking loudly during a religious ceremony); (v) Sanctity/degradation: situations involving the "degradation" of moral principles (e.g., engaging in sexual behavior such as incest); and (vi) Liberty/oppression: situations involving the restriction of personal freedom (e.g., forcing an individual to wear a specific item of clothing).

As presented by Haidt (2008), the first three foundations are oriented toward the valuation of the individual (Individualizing Foundations), while the last three value the collective (Cohesive Foundations). The recent literature on moral processing is based predominantly on the assumptions of the MFT (Haidt, 2003, 2008, 2012; Graham et al., 2013), which has stimulated the development of instruments such as the Moral Foundations Questionnaire (Graham et al., 2011) and the Moral Foundations Vignettes (MFVs; Clifford et al., 2015). On the other hand, the degree to which moral dilemmas engage emotional processing varies systematically with the influence that emotion has on moral judgment (Greene et al., 2001). However, Haidt and Greene disagree about the role of reason in moral psychology: Greene stresses the relevance of thinking in a "manual mode" (the rational and controlled judgment system), in contrast to the "automatic mode" regulated by emotion and intuition that is defended by Haidt (2001), who considers emotion to be the only source of moral judgment, which the manual mode merely rationalizes (Greene, 2013). In addition, moral judgment is thought to change according to social and cultural influences (Haidt et al., 1993). This conception contrasts with the belief, widespread in the twentieth century, that a rational and deliberate process underlies moral decisions (Kohlberg, 1969; Turiel, 1983). Although the notion that moral judgment is grounded in emotion is influential, the evidence is still considered insufficient by some, who argue that emotions may have little influence on moral judgment (Huebner et al., 2009).

Furthermore, a group of researchers has recently criticized the MFT, arguing that it fails to identify specific activation modules that are triggered by the violation of each foundation (and that produce the ensuing affective reaction). In the face of these criticisms, and in addition to the importance of factors such as *Nativism*, *Cultural Learning*, *Intuitionism*, and *Pluralism* in accounting for the development of personal morality (see Graham et al., 2013, for a more in-depth analysis), a group of researchers predominantly represented by Kurt Gray has recently developed the Theory of Dyadic Morality (TDM; Schein & Gray, 2018), which suggests that morality or moral violations are represented socially through different forms of harm but nevertheless have the same ontological basis.

As highlighted by Pizarro (2000), emotions are typically understood as processes antagonistic to moral judgment: their impact on judgment is sometimes ignored altogether and sometimes assumed to be harmful. However, a series of contemporary studies points to the close relationship between the two phenomena, frequently highlighting the causal role that emotional modulation plays in moral judgment (Haidt et al., 1993; Schnall et al., 2008). This impact sometimes contributes to judgment, for example in cases where disgust related to a moral violation guides the condemnation of that violation. On the other hand, emotions can also guide immoral behavior, for example in cases where positive affect guides acts of injustice or corruption, such as those often observed in political contexts.

Attitudes and judgments can be formed automatically, without explicit reasoning and on the basis of pre-established concepts, or in a more complex way that draws on different perspectives (Van Bavel et al., 2015). As noted by Koenigs et al. (2007), some brain structures are related to more deontological moral judgments, and when these structures are damaged, the most intuitive judgments predominate, demonstrating that moral judgments are present in both situations. However, cognitive processes may be present to a greater or lesser extent. Moreover, some studies have demonstrated that emotional intuitions can significantly impact moral judgment and reasoning in both adults and children (Danovitch & Bloom, 2009; Malti & Ongley, 2014). As such, differences in moral judgment at distinct stages of development may often be due to individual differences in the development of emotional processing and the regulation of these emotional intuitions (Eisenberg, 2000).

#### **Emotional Processing**

Several studies have identified overlapping areas in the brain responsible for both moral judgment and emotional processing, including the insula (Vicario et al., 2017; Ying et al., 2018), amygdala (Decety et al., 2012; Harenski et al., 2014), orbitofrontal cortex (OFC; Fumagalli & Priori, 2012), ventromedial prefrontal cortex (PFC; Shenhav & Greene, 2014; Pascual et al., 2013), and anterior cingulate cortex (ACC; Pascual et al., 2013). Moll and de Oliveira-Souza (2007) suggest that this overlap may be due to the dependence of moral reasoning and judgment on the engagement of multiple emotion-related systems in the brain, citing the ventromedial PFC as one of the key nodes in this network, acting as an interface between emotional experience and moral decision-making.

There have been frequent reviews of research into moral judgment and decision-making due to the increasing importance of moral behavior and reasoning in modern life. Several reviews have dedicated themselves to establishing the neural basis of the cognitive processes underlying moral reasoning (Forbes & Grafman, 2010; Van Bavel et al., 2015). A greater understanding of the physiology of the "moral brain" has been made possible by the so-called boom of functional neuroimaging studies (Greene & Haidt, 2002). Verplaetse et al. (2014) have identified some of the key nodes in a neural system subserving moral cognition, including (i) the medial frontal gyrus; (ii) the superior temporal sulcus; (iii) the temporoparietal junction; (iv) the orbitofrontal cortex; (v) the ventromedial PFC; and (vi) the dorsolateral PFC. In particular, some structures of the PFC deserve to be highlighted, as they have a distinct impact on the cognitive and social processes underlying moral judgment (Forbes & Grafman, 2010).

The dorsolateral PFC has been implicated in many aspects of moral intuition. Forbes and Grafman (2010) suggest an auxiliary function of the right dorsolateral PFC in the integration of complex emotional responses that are generated by the evaluation of information from the context being judged, increasing the weight of emotion in the decision. However, Greene et al. (2004) found evidence of greater involvement of the same region in more difficult personal moral dilemmas, which require greater rational cognitive processing.

On the other hand, Greene (2007) found that patients with lesions in the ventromedial PFC made more utilitarian moral judgments, with less cognitive elaboration. More recently, a study of group categorization demonstrated that the ventromedial PFC showed greater activation in situations in which participants evaluated themselves as belonging to a specific group than in situations in which they did not belong (Molenberghs & Morrison, 2012), revealing a role for the ventromedial PFC in social categorization as well. However, the studies described above only reveal correlations between different forms of moral judgment and brain activation.

Interestingly, emotions themselves may be moral in character, including such complex emotions as guilt, shame, and righteousness (Turner & Stets, 2006). These moral emotions often signal emotional arousal in response to moral violations or conformity but may have a primary role as "triggers" for more basic emotions such as anger, fear, and hatred. Similarly, our emotional reactions to moral violations of fairness and our propensity to engage in prosocial behavior have been shown to depend on similar neural substrates as reactions to situations eliciting disgust (Sanfey et al., 2003; Tabibnia et al., 2008).

#### **Morality**

Relatively few studies have been published on the development of the psychological and neural underpinnings of moral judgments. To date, the primary theories in this field continue to be those proposed by Jean Piaget and Lawrence Kohlberg, two of the most significant scholars of moral and cognitive development of the twentieth century, who saw morality primarily in terms of justice, care, and respect for authority (Bloom & Wynn, 2016).

#### *Piaget*

For Piaget et al. (1989), moral values are constructed from the interaction between the subject and the various social environments which he/she engages with, and it is through daily coexistence with others in adulthood that we build our moral values, principles, and norms. Processes of internal organization and adaptation are necessary for these interactions to occur, which Piaget's model categorizes as interactions of assimilation and accommodation. Assimilation schemas vary according to the stage of individual development and are defined as strategies for conflict resolution based on pre-existing cognitive structures and knowledge. Furthermore, Piaget argues that the development of morality is composed of three phases: (i) a "premoral" phase, (ii) a "heteronomous" phase, and (iii) an "autonomous" phase.

The first, "premoral" phase, present in children up to 5 years of age, is one in which the child bases their rules of conduct on immediate needs rather than on a set of moral norms that supersede behavior. When the child obeys an internally generated rule, the behavior is reinforced through habit and not by a sense of right and wrong. A baby who cries until fed is an example of moral behavior in this phase.

The second phase, that of heteronomous morality, is typically present in children aged 5–10 years. In this stage, morality corresponds to behavior that complies with social rules and norms; any behavior that departs from them is considered wrong. According to heteronomous reasoning, a poor man who steals medicine to save his wife's life commits a moral wrong equal to that of a man who murders his wife.

Finally, during the third phase of moral development, autonomous morality, individuals set moral codes and rules by mutual agreement.

However, as pointed out by Vozzola (2014), Piaget's classical theory has strong points that should be considered, such as its account of the influence of the environment on development and of what we can structure in order to stimulate the child, but it also has weaker aspects, such as the fact that Piaget underestimates the role of culture and education in fostering cognition and moral development. The important role of cognitive development in moral development is evident in a study by Smetana and Ball (2018), which showed that children make distinct moral judgments regarding physical damage and psychological damage (both from the Care foundation) because the first is concrete while the latter may have no direct and observable consequences and therefore requires a more advanced understanding of the thoughts and feelings of others (Helwig et al., 2001; Smetana et al., 2012). In particular, young children's judgments of psychological damage are hampered by the difficulty of coordinating moral assessments with an understanding of intentions, actions, and outcomes (Jambon & Smetana, 2014).

#### *Kohlberg*

Kohlberg (1976) divided moral development into intervals based on the responses he observed to hypothetical dilemmas presented in the form of stories, concluding that there are three main levels of moral reasoning with two stages each.

The first level is that of "preconventional morality," which is divided into an initial stage of orientation to punishment and obedience, where the child decides what is wrong on the basis of what behavior is punished, and a subsequent stage of individualism, instrumental purpose and exchange, where the child follows rules when it is in his/her immediate interest. This level is largely related to the moral foundation of authority, which values both respect for the rules established by a moral authority and punishments for moral transgressions. The role and importance of authority figures and social norms guiding the individual's principles of right and wrong are also established at this level.

The second level is that of "conventional morality," which is divided into an initial stage of mutual interpersonal expectations, relationships, and interpersonal conformism, where those actions that meet the expectations of the family or other significant social grouping are deemed to be morally right (directly related to the moral foundation of loyalty). The later stage in this level, that of social system and consciousness, emphasizes that moral actions are those defined by broader social groups (e.g., a nation or people) or by society as a whole (Kohlberg, 1976).

Finally, the third level is that of "postconventional morality," which is divided into an initial stage of orientation by the social contract, where the individual acts in order to achieve the "greater good for the greatest number of people" (i.e., utilitarianism), and a subsequent stage of universal ethical principles, where the individual develops and follows ethical principles through reflection and personal choice to determine what is morally right (Kohlberg, 1976).

As with Piaget, Vozzola (2014) also points to the strong and weak points which can be highlighted in Kohlberg's theory. Kohlberg primarily asserts that it is through development that people can construct "a deeper understanding of particular social practices or of more specific social contexts" in qualitative divisions based on hypothetical dilemmas.

#### *Current Perspectives*

More recently, Saarni (2011) has highlighted the construction of emotional competence as a key milestone in moral development, describing it as a set of cognitive and regulatory skills and goal-oriented behaviors that emerges over time relative to the individual's social context. As discussed by Eisenberg (2000), individual factors such as cognitive development and temperament influence the development of emotional competence, which can also be influenced by social experiences and learning, including the individual's history of social relations and beliefs. Emotion regulation abilities may also mediate how emotional intuitions impact moral judgment and reasoning. Some of the skills that make up emotional competence are: (i) the ability to discern and understand others' emotions based on situational and expressive cues; (ii) the capacity for empathy and sympathy with the emotional experiences of others; and (iii) the ability to soften the intensity of aversive and distressing emotions using self-regulation (Eisenberg, 2000).

Saarni (2011) also states that the child's relationship with their caregivers constitutes the initial context in which the child's emotional life unfolds, so that this relationship structures the development of emotional skills and future social relationships (see also Graziano et al., 2010; John & Gross, 2004). The same author goes on to say that a secure bond between the caregiver and the child leaves the child free to explore the world and engage with peers, whereas an insecure or unstable attachment is associated with emotional and social incompetence, particularly in the areas of understanding emotions and regulating anger. Typically, in younger children the expression and regulation of emotions are less developed, requiring greater support and reinforcement from the social environment. The development of these skills does not occur in isolation, and its progression is intricately linked with cognitive development (Eisenberg, 2000; Saarni, 2011).

In this sense, some studies have investigated the influence of emotion regulation on moral judgment (Feinberg et al., 2012; Lee et al., 2013; Li et al., 2017; Zhang et al., 2017; Helion & Ochsner, 2018). For example, one study pointed out that habitual cognitive reappraisal influences the rigidity of moral judgment, so that individuals who frequently use cognitive reappraisal also show more liberal moral judgment (Feinberg et al., 2012). Along the same lines, another study revealed that habitual cognitive reappraisal, in addition to being related to less conservative behaviors, is also related to less support for conservative policies, which demonstrates that this form of cognitive control has as much influence on moral attitudes as on moral judgment (Lee et al., 2013).

#### **Conclusions**

Several studies address the relationship between emotion and moral judgment (Pizarro, 2000; Greene et al., 2001; Haidt, 2001; Helmuth, 2001; Haidt, 2003; Koenigs et al., 2007; Moll & de Oliveira-Souza, 2007; Tangney et al., 2007; Huebner et al., 2009; Feinberg et al., 2012; Zhang et al., 2017; Wagemans et al., 2018), some highlighting the duality between faster/intuitive and slower/deontological judgments, others defending the dominance that emotions exert in guiding decision-making processes (Haidt, 2012; Greene, 2013). In one way or another, moral psychologists show great interest in studying the relationship between these two phenomena, since it affects areas such as law, politics, public health, and interpersonal relationships in general. In addition, emotions are currently discussed as active processes rather than as the mere physiological consequence of a given stimulus, which highlights the important role of cognitive processes, such as emotion regulation, in modulating the emotional response. In that sense, the specific assessment of different moral foundations at different ages can contribute to a better understanding of how moral judgment develops across the stages of development. It is also essential to assess the development of moral judgment during adulthood, as well as in different sexes.


## **Chapter 8 Trust in Social Interaction: From Dyads to Civilizations**

#### **Leonardo Christov-Moore, Dimitris Bolis, Jonas Kaplan, Leonhard Schilbach, and Marco Iacoboni**

L. Christov-Moore (\*) · J. Kaplan

D. Bolis

International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich, Germany

Munich Medical Research School (MMRS), Dekanat der Medizinischen Fakultät, Ludwig-Maximilians-Universität München, Munich, Germany

L. Schilbach

Independent Max Planck Research Group for Social Neuroscience, Max Planck Institute of Psychiatry, Munich-Schwabing, Germany

LVR Klinikum Düsseldorf/Kliniken der Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany

Ludwig-Maximilians-Universität, Medical Faculty, Munich, Germany

M. Iacoboni

Brain and Creativity Institute, University of Southern California, Los Angeles, CA, USA

Independent Max Planck Research Group for Social Neuroscience, Max Planck Institute of Psychiatry, Munich-Schwabing, Germany

Department of Psychiatry and Biobehavioral Sciences, Ahmanson-Lovelace Brain Mapping Center, Brain Research Institute, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA

**Abstract** Human trust can be construed as a heuristic wager on the predictability and benevolence of others, within a compatible worldview. A leap of faith across gaps in information. Generally, we posit that trust constitutes a functional bridge between individual and group homeostasis, by helping minimize energy consumed in continuously monitoring the behavior of others and verifying their assertions, thus reducing group complexity and facilitating coordination. Indeed, we argue that trust is crucial to the formation and maintenance of collective entities. However, the wager that trust represents in the face of uncertainty leaves the possibility of misallocated trust, which can result in maladaptive outcomes for both individuals and groups. More specifically, trust can be thought of as a scale-invariant property of minimizing prediction error within ascending levels of social hierarchy ranging from individual brains to dyads, groups and societies, and ultimately civilizations. This framework permits us to examine trust from multiple perspectives at once, relating homeostasis, subjective affect and predictive processing/active inference at the individual level, with complexity and homeostasis at the collective level. We propose trust as a paradigmatic instance of an intrinsically dialectical phenomenon bridging individual and collective levels of organization, one that can be observed in daily experience and empirically studied in the real world. Here, we suggest collective psychophysiology as a promising paradigm for studying the multiscale dynamics of trust. We conclude with discussing how our integrative approach could help shine light on not only the bright but also the dark sides of trust.

**Keywords** Trust · Social interaction · Empathy · Homeostasis · Emergence · Scale invariance · Active inference

#### **Introduction: A Broken Leg and a Stranger**

You have fallen and broken your leg on the stairs outside your apartment. Your phone is inside and you're immobile, in agony. In that moment a stranger walks up, seemingly concerned. You look them in the eyes. They seem kind. You take a chance and ask them if they can quickly go up to your apartment, grab your phone from the coffee table, and call the emergency line. They agree to help, and you breathe a sigh of relief. In the face of your limitations in that moment, you have made a very specific wager: that they will not defy your prediction of their behavior, and particularly, that they will not do so in a way that runs counter to your interests. In other words, not only are you wagering that they will not surprise you by donning a helmet and beginning to sing opera, you are specifically wagering that they will not betray you by taking advantage of the situation, e.g., stealing everything and walking away. You have decided to trust them.

Trust, this bet on predictable benevolence, is a social heuristic (another word for shortcut) that enables us to navigate a world about which we have limited direct information and within which we have limited agency. Trust is a leap of faith across gaps in information, reducing the energy we would otherwise spend independently verifying others' beliefs, intentions, and actions (Braynov, 2002), and performing all the actions necessary for our survival alone. Trust allows us to plug others' hypotheses into the gaps in our information and behave in belief and in action as if those hypotheses are accurate. Doing so in no way ensures that the information is accurate, so trust is ultimately a wager, a simulation of the world that we treat as real. This reduces the energy cost associated with uncertainty, facilitating cooperation, community, and group efficiency (Lewis & Weigert, 1985; Luhmann, 1979). Due to its profound material and subjective advantages, it can be considered a form of social capital (Bachmann, 2001; Morgan & Hunt, 1994; Fukuyama, 1996; Zheng et al., 2008). By virtue of these multiple capacities, it is a foundational pillar of human interaction, ranging from pairs of people (i.e., dyads), to families and societies, all the way up to the global web of socioeconomic relations that undergirds our civilization (Misztal, 1996; Zak & Knack, 2001).

In this chapter, we will dissect the principal components of trust, the role of trust in reducing individual prediction error and group complexity (Lewis & Weigert, 1985) in and through social interaction (cf. Bolis & Schilbach, 2020a; Ramstead et al., 2018). More concretely, we will link trust to fundamental properties of predictive processing and homeostasis and, in doing so, formalize and situate our framework within the contemporary theoretical landscape. Last, we will discuss the potential maladaptive outcomes of trust formation and maintenance and possible insights on how to avoid them. This constitutes a novel framework with which to understand and study trust empirically, relating it inwards to phenomenology, cognition and affect, and outwards to informational and energetic properties of groups, within a global framework of homeostasis and free energy minimization (Table 8.1).

#### **Conceptions and Components of Trust**

Classical research on trust describes its cognitive, affective, and behavioral components, while primarily approaching it via two core accounts. From the psychological perspective, the disposition to trust is conceived as a trait difference dependent on properties of the other, such as honesty, status, benevolence, etc. From the behavioral perspective, trust is modeled as a risky but advantageous wager on future reciprocity, primarily studied through a small set of paradigms such as the prisoner's dilemma and the trust game (reviewed in Lewis & Weigert, 1985). Trust can be reduced to the following: one agent (the trustor) engages in a belief about a future outcome that relies on the behavior of the other (the trustee). This may be voluntary (as when you decide to give the keys to your apartment to a friend) or compelled (as in the case of your broken leg) (Bamberger, 2010; Mayer et al., 1995; McKnight & Chervany, 1996).
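The behavioral account of trust as a "risky but advantageous wager" can be made concrete with a minimal sketch of a one-shot trust game. The structure below follows the commonly used investment-game convention, in which the amount sent by the trustor is multiplied before reaching the trustee; the multiplier of 3, the function name `play_trust_game`, and its parameters are illustrative assumptions rather than details taken from this chapter.

```python
from dataclasses import dataclass


@dataclass
class TrustGameOutcome:
    trustor_payoff: float
    trustee_payoff: float


def play_trust_game(endowment: float, sent: float,
                    returned_fraction: float,
                    multiplier: float = 3.0) -> TrustGameOutcome:
    """One round of a hypothetical trust game.

    The trustor sends `sent` out of `endowment`; the amount is multiplied
    (assumed factor: 3) on its way to the trustee, who then returns
    `returned_fraction` of what they received.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must lie between 0 and the endowment")
    if not 0 <= returned_fraction <= 1:
        raise ValueError("returned_fraction must lie between 0 and 1")
    received = sent * multiplier
    returned = received * returned_fraction
    return TrustGameOutcome(
        trustor_payoff=endowment - sent + returned,
        trustee_payoff=received - returned,
    )


# Three stylized scenarios for a trustor with an endowment of 10 units.
no_trust = play_trust_game(10, sent=0, returned_fraction=0.5)
reciprocated = play_trust_game(10, sent=10, returned_fraction=0.5)
betrayed = play_trust_game(10, sent=10, returned_fraction=0.0)
```

Under these assumptions, reciprocated trust leaves both parties better off than no trust, while betrayed trust leaves the trustor with less than they started with. This is the sense in which the wager is both advantageous and risky.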

From the individual perspective, trust is essentially regarded as a belief in the predictability of a future outcome, whether in the cognitive and social sciences (Lewis & Weigert, 1985), in management science and business (Cui et al., 2018; da Rosa Pulga et al., 2019), or in law. Predictability here means that you have well estimated both the parts you can "model" and the parts that are chaotic or unpredictable, i.e., the explanatory variables and the random error. Predictions about others constitute a cardinal process in social interaction on both an intrapersonal and interpersonal level (Frith & Frith, 2012; Timmermans et al., 2012). Conversely, by making oneself predictable, and thus facilitating trust formation, one can help increase the chances of continuing to interact with others (Coan, 2015), while collectively decreasing metabolic cost (Theriault et al., 2021). We don't just trust those we find predictable, we also seek to make ourselves more predictable to those we trust.

However, predictability alone is not sufficient to capture what we mean when we say "I trust you." I can trust that person X will betray me if I let down my guard. Person X in this account is predictable, but I cannot say that I trust them if I am sure that they will betray me given the opportunity. To say that I trust them (rather than trust that they will do as expected) conveys a belief that they are predictably benevolent. Philosophers have addressed this by distinguishing trust from reliance, where trust can be betrayed while reliance can only be disappointed (Baier, 1986, p. 235). I may rely on a clock to give the time, but I do not generally feel betrayed when it breaks. Trust is thus different from reliance in that a trustor accepts the risk of being betrayed. In contrast, I can feel confident in the competence of another without being invested in their benevolence (Nooteboom, 2017).

**Table 8.1** Glossary of terms

Adapted from Bolis et al. (2017)

In addition to predictability and benevolence, we argue that trust requires the trustor's implicit belief that the trustee's conception of benevolence and perception of reality are compatible with their own. I can find a person's behavior both predictable and a clear expression of their benevolence according to their worldview but still not trust them because that expression of benevolence is incompatible with mine. Take the case of a fanatic who offers to ceremoniously sacrifice me in order to reunite me with my creator. I may believe that they are sincerely benevolent, but I cannot say that I trust them, because human sacrifice is not consistent with my concept of benevolence. In addition, benevolence is often expressed through the transfer of accurate information. Trust requires that we be able to generally believe another's assertions about the world. Take the case of a person or group of persons significantly deviating from an intersubjective consensus in a given spatiotemporal and, thus, sociocultural context, e.g., a schizophrenic or a person under the influence of strong hallucinogens. In addition to qualms about their benevolence, I cannot use the information they provide me about the world, unless I have access to their perspective of reality. Thus, trust in our estimation entails predictability of benevolence within an interpersonally compatible conception of reality, including one's values within that reality. In this light, trust is fundamentally a dialectical relationship between individual properties (cf. predictable benevolence) and collective properties (cf. shared reality and values).

The formation of social bonds, whether they be romantic, friendly, or transactional, relies on the formation and building of trust. Not surprisingly, evidence suggests that the ability to trust and the state of trusting are not only beneficial to the survival of an organism, they are also important for subjective well-being (DeNeve & Harris, 1998). To feel trusted and to trust others feels good, fosters calm, and is important for healthy, warm relationships (DeNeve, 1999). At the other extreme, the betrayal of trust is deeply traumatic and difficult to recover from, whether in relationships or societies (Lewis & Weigert, 1985). In Dante's Inferno, the lowest circle of hell was reserved for traitors. For what greater sin exists within the human mind and human mythology than betrayal? The emotional experience of being betrayed is so damaging because it constitutes a massive prediction error in the human psyche, a defiance not only of current predictions but of an entire history of belief. Indeed, the powerful subjective experience of trust and its betrayal, and this experience's ubiquity and power within human art and mythology, point to its pivotal role as a group-level homeostatic function.

Despite manifesting as a compelling component of individual experience, trust is an inherently social construct (Lewis & Weigert, 1985; Searle, 1995), much like language, power, surveillance, and accountability (Gerck, 1998a, b). Note here that this extends to trust in one's concept of self as the third person, e.g., "I don't trust myself to drive drunk." Trust acts as a group-level adaptive mechanism that makes social life more predictable and less dangerous, thereby facilitating coordination (Lewis & Weigert, 1985; Tomasello, 2014). Inversely, the absence of trust inhibits the formation of community, impedes cooperation and in doing so increases instability and entropy within the group, reducing its efficiency (Braynov & Sandholm, 2002; Lewis & Weigert, 1985; Zak & Knack, 2001). Along similar lines of thought, modern economics considers trust as an economic lubricant, reducing the cost of transactions between parties, minimizing risk and uncertainty, as well as enabling new forms of cooperation and, at the macro level, generally facilitating business, with hypothesized macroeconomic effects even on indices such as GDP and inflation (Morgan & Hunt, 1994; Singh, 2012; Zheng et al., 2008).

Let us imagine an illustrative example of trust's role in groups (cf. Tomasello, 2014): In a primitive world where relatively small prey was abundant, each individual anthropoid could hunt and take care of their satiation needs independently. Imagine now an abrupt ecological shift after which small prey (e.g., chickens) has been replaced by significantly bigger prey (e.g., bears), which is impossible to hunt individually. In this case, individuals have two options: either they continue going independently for the limited small prey or they learn to coordinate in order to effectively catch the bigger prey. One such strategy could have been for one individual to scare a bear which, trying to escape, falls into a trap set by several other anthropoids. Here, crucially, the first anthropoid in our hypothetical scenario must trust that the others will provide a fair share of the food at the end of the day.

At a certain point of evolutionary history, humanoids were potentially presented with a fundamental dilemma of trust: I either act independently, risking the unavoidable case of running out of suitable prey; or trust and coordinate with others in order to survive both as an individual and as a group, risking being potentially betrayed. However, the heuristics we use to establish trust, such as perceived similarity, group affiliation, or charismatic persuasion, can result in emergent, maladaptive outcomes. We may trust groups whose behavior, while coordinated, may be ultimately detrimental to ourselves and others or groups whose own internal dynamics may be ultimately self-destructive. Indeed, throughout the evolution of organisms and superorganisms (Kesebir, 2012) of greater and greater complexity, there must have existed a dynamic balance between the opposing needs to minimize prediction error by trust and, on the other hand, to withhold and restrict trust because of the high risk it implies.

The Bayesian perspective construes the brain as an organ that calculates and maintains expectations about subsequent events in the environment or within the body, by combining prior experience (priors for short) with newly sensed information (evidence) to form updated, or posterior, beliefs. Crucially, the more confidence (i.e., precision) is placed on the validity of existing prior expectations/beliefs, the less these are updated in the face of new incoming information (i.e., evidence). Trust in this light operates primarily upon gaps in information or points of uncertainty, allowing us to place high confidence in one's (or trusted others') priors in the absence of data. After all, if one possessed absolute knowledge, trust would be unnecessary.
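The relationship between prior precision and belief updating can be written out for the simplest case. The following is a minimal worked example under an assumed Gaussian model, not a formalization given by the authors: the symbols $\mu_0$ (prior mean), $\pi_0$ (prior precision, the inverse variance), $y$ (a new observation), and $\pi_y$ (the precision of that observation) are introduced here for illustration only.

$$
\mu_{\text{post}} = \frac{\pi_0\,\mu_0 + \pi_y\,y}{\pi_0 + \pi_y}
= \mu_0 + \underbrace{\frac{\pi_y}{\pi_0 + \pi_y}}_{\text{update gain}}\,(y - \mu_0),
\qquad
\pi_{\text{post}} = \pi_0 + \pi_y
$$

The term $(y - \mu_0)$ is the prediction error, and the gain determines how far the belief moves in response to it. As $\pi_0$ grows relative to $\pi_y$, the gain approaches zero, so a belief held with high confidence is barely revised by new evidence.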

Trust reduces the energy used in mitigating uncertainty around incomplete information and lessens the impact of conflicting data. Trust can thus be viewed as a complementary mechanism of attention. Attention putatively allows for reallocation of monitoring through selectively keeping precision (confidence) of incoming information high or low, in order to attend or not, respectively (cf. Mirza et al., 2019; Friston, 2009). Trust acts by reallocating attention as if prediction error was low (Fig. 8.1), through the selective tuning of precision/confidence toward the trustee, resulting in a reduction of energy consumption. Trust is a wager driving informational confidence in prior beliefs about blind or occluded spots in the social and material world: the unseen priors driving others' observed behavior and our priors about the unseen world that trusted others provide.

**Fig. 8.1** Trust minimizes apparent prediction error and facilitates the interpersonal sharing of priors

Let us unpack this core idea intuitively. When introducing their Bayesian model of selective attention, Mirza and colleagues describe the example of the lost red pen (Mirza et al., 2019). Imagine you are at your office and you have lost a red pen. How will you search for it? Or to put it more technically, how will you sample the environment? A naive robot with the luxury of ample time might choose to deploy a serial search, sequentially scanning all possible positions in the room until the red pen is found. However, in real life, humans deploy certain heuristics in order to optimize their sensorimotor processes in space and time and eventually the chances of maintaining their own existence. Searching for a red pen, arguably, might not be crucial for one's own life these days, yet quickly spotting the red apple on a tree and, thus, avoiding eating a poisonous fruit or encountering a wild predator might have been a life-saving process in prehistoric times. In this scheme, attention modulates the expected precision (confidence), so that task-irrelevant observations have less expected information gain, resulting in an agent less motivated to actively seek them (Mirza et al., 2019).

In this light, the features of the environment employed in searching for the red apple have been suitably modulated in order to fit the purposes of the search. By selectively increasing the precision of features like the redness or the curving shape of the apple, these features will become more salient in the belief updating process, as high precision (confidence) information weighs more in Bayesian information integration.

Now imagine a stranger offers you a fruit that looks like an apple, assuring you it is delicious. Will you eat that fruit? A naive approach assuming the luxury of ample time may include meticulously researching the stranger's past and slowly forming an understanding of their feelings, intentions, and beliefs. However, in real life, one is typically forced to make a decision quickly based on incomplete information. If you choose to trust this person, you choose to consider their offer as a predictable, benevolent action from a person sharing similar values with you, and as such not worthy of further scrutiny. You will take the apple and skip the time and energy necessary for verification of its nature. Technically speaking, one is choosing to increase the confidence of the priors about the trustee, allowing one to selectively disregard or weigh less other types of (even contradictory) information in a context-specific manner, regarding the trustee or the information coming from them, as well as the specific goals and contingencies. This wager saves energy but carries the risk of both being incorrect and not learning from it, with potentially disastrous consequences. In summary, we construe trust as a selective increase of prior beliefs' precision/confidence about the trustee predicated on a wager on the predictability, benevolence, and interpersonal similarity of the trustee.
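The idea of trust as a selective increase in prior precision can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions, not the authors' model: the belief about the trustee's benevolence is treated as a single Gaussian quantity (1.0 meaning benevolent, 0.0 neutral), each contradictory observation has a value of -1.0 and unit precision, and the names `update_belief` and `observations_until_doubt` as well as the doubt threshold of 0.5 are hypothetical.

```python
def update_belief(mean: float, precision: float,
                  observation: float, obs_precision: float):
    """Precision-weighted Gaussian update of a single belief."""
    new_precision = precision + obs_precision
    new_mean = (precision * mean + obs_precision * observation) / new_precision
    return new_mean, new_precision


def observations_until_doubt(prior_precision: float,
                             prior_mean: float = 1.0,
                             observation: float = -1.0,
                             obs_precision: float = 1.0,
                             doubt_threshold: float = 0.5,
                             max_steps: int = 10_000) -> int:
    """Count contradictory observations needed before the belief in
    benevolence drops below the (assumed) doubt threshold."""
    mean, precision = prior_mean, prior_precision
    for step in range(1, max_steps + 1):
        mean, precision = update_belief(mean, precision,
                                        observation, obs_precision)
        if mean < doubt_threshold:
            return step
    return max_steps


# A wary observer (low prior precision) versus a trusting one (high).
wary_mean, _ = update_belief(1.0, 1.0, -1.0, 1.0)
trusting_mean, _ = update_belief(1.0, 20.0, -1.0, 1.0)

wary_steps = observations_until_doubt(prior_precision=1.0)
trusting_steps = observations_until_doubt(prior_precision=20.0)
```

In this toy setting, a single contradictory observation moves the wary observer all the way to neutrality while barely shifting the trusting observer, who also needs several times as many contradictory observations before starting to doubt. This mirrors both sides of the wager described above: less energy spent on re-evaluation, at the cost of slower learning when the trust is misplaced.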

#### **The Experience and Function of Trust: Affect and Homeostasis**

Think of someone you trust. What is trusting them like? What does it mean for your relationship with them? Who has not experienced the deep comfort of feeling trust in another human being? Or the heartbreaking, scarring pain that comes from experiencing trust betrayed? Trust's affective content is undeniable in daily life (Lewis & Weigert, 1985). Classical accounts of trust conceive of trust in terms of individual predispositions toward trusting, judgments based on qualities of the trustee, and as a rational cost-benefit analysis, in which trust constitutes a risky but advantageous wager on future benevolence (Lewis & Weigert, 1985). These accounts, however, largely overlook the subjective experience of trust and do not analyze what the functional role of this affective component may be, or from where it may arise.

Contemporary theories of affect posit that feelings are the extrusion into conscious experience of homeostatic processes that arise from the interaction of the body, brain, and exterior milieu (Damasio, 2018; Damasio & Carvalho, 2013). Homeostasis is here not taken to mean a stable unchanging state but rather a dynamic process aimed at minimizing the prediction error or disconnect between one's expectations of the world and one's body, and information derived from the world and one's body (cf. Seth, Suzuki, & Critchley, 2012). The feelings associated with a process such as trust (especially their valence and intensity) are informative in understanding the importance and role of that process. The intensity of trust's formation and betrayal suggests from a homeostatic perspective that there is something so profoundly useful about trusting relationships that we are evolved to seek, enjoy, foster, and preserve them. Indeed, the powerful feelings associated with trust (cf. Lewis & Weigert, 1985) drive correspondingly striking patterns of behavior and belief: if you trust someone in the extreme, you assume that they will act in your interests despite ample motivation to do otherwise. You assume that they are not misleading you even if the content of their words is unbelievable and assume that there is a rhyme and reason to their actions even if they appear nonsensical or deeply unethical.

Correspondingly, the neural correlates of trust appear related to affective processing in general. Judgments of trustworthiness involve a broad array of affective decision-making regions such as the anterior cingulate, frontal lobe, caudate, insula, and amygdala (Todorov et al., 2008; Watabe et al., 2011; Winston et al., 2002). The amygdala has been thought to be hyperactive in response to specifically untrustworthy faces (Adolphs et al., 1998; Baron et al., 2011; Santos et al., 2016), although some data suggest the amygdala may actually have a non-linear response to the strength of the trustworthiness judgment (Freeman et al., 2014; Said et al., 2009). The assessment of trustworthiness is associated with amygdalar activity (signaling trust or mistrust), as well as connectivity with and activation of regions associated with top-down control of affect like dorsolateral prefrontal cortex, temporoparietal junction, and ventromedial prefrontal cortex (Bellucci et al., 2019).

The maintenance and formation of unconditional trust relationships is associated with activation in mesolimbic reward systems associated with encouraging and reinforcing optimal behaviors (Krueger et al., 2007). When choosing to trust and interacting with a trustworthy individual, activity heightens in the orbitofrontal cortex and caudate, though this activation decreases with age, suggesting that trust and cooperation become more of a given than a novel reward (Decety et al., 2004; Fett et al., 2014; Gromann et al., 2014). It is also thought that affective regions such as the insula and anterior cingulate may be hyper-responsive when there is a possibility of betrayal, reflecting the strong importance of avoiding misplacing trust (Aimone et al., 2014; Fett et al., 2014).

As social animals, we need trust because it is simply impossible to experience the whole of our social and material world directly. Much of our perception of the world relies on others (cf. participatory sense-making; De Jaegher & Di Paolo, 2007) and on dialectical attunement (Bolis & Schilbach, 2020a; Vygotsky, 1978). Hence, from the perspective of a homeostatic organism that seeks to minimize prediction error about the world, what could be better than having an affective marker of the validity of one's models of others? Individuals within trusting (and hence coordinated) groups expend radically less energy in monitoring the world, getting information and verifying it, predicting the behavior of others, and ensuring their benevolence. Conversely, high unpredictability and a high possibility of malevolence or betrayal within our immediate environment is energy-intensive to monitor and manage, detrimental to homeostasis, highly suboptimal for group efficiency, and correspondingly regarded as both stressful and unpleasant (DeNeve, 1999; DeNeve & Harris, 1998; Singh, 2012).

The affective nature of trust supports a crucial role for it in the formation and maintenance of efficient, coordinated groups, from dyads to civilizations (Lewis & Weigert, 1985; Zak & Knack, 2001). Much as the theory of embodied cognition speaks of the brain's cognitive processes as situated within the body (Wilson & Foglia, 2011), we cannot disregard that as the brain exists within the body, the body exists and grows within a society in virtually every instance in our evolution. It has been theorized that we "extend" our cognition via external tools (Clark & Chalmers, 1998), such as our phones. Perhaps our cognition, knowledge, and affect are also extended within our trusted groups. Our knowledge of the world, our reactions to it, and our motivations within it are informed by others we trust, as such trusted conspecifics are able to compensate for incomplete data and uncertainty, thus achieving coordinated and potentially optimal knowledge, motivation, and affective states that emerge between individuals. The wager of trust consists in our choice of who to include in this collective entity, and thereby our window to the unseen and occluded world.

If our approach to trust is correct, then the social pain of betrayal may reflect more than individual homeostasis: it reflects "damage" to the collective, a reflection into the individual of group homeostatic signals. When I trust you, I'm in a sense saying "Let's form a superorganism, a collective entity, together." This may be why violations of trust are deeply traumatic and have such an enduring impact on subsequent behavior and inference. When betrayed we are saying: "We were a superorganism together. I did not second guess your inferences about the world, or your future behavior towards me, and it was great. As it turns out, I was wrong. My choice of superorganism and the investment it implied was incorrect. What a surprising disappointment." The intensity and valence of the affective response when trust is betrayed suggests that it constitutes a massive prediction error that must be incorporated into behavior and decision making, and remembered about that person, group or institution, lest we risk future negative consequences. Via this bridge between phenomenology and individual and collective forms of homeostasis, we can draw novel links between the subjective experience of trust and predictive theories surrounding cognition.

#### **Trust as Dialectical Bridge Between Individual Prediction Error and Group Complexity**

We posit that trust is a characteristic example of a phenomenon bridging individual accounts of cognition that emphasize the minimization of prediction error (Clark, 2013) and group accounts of efficiency and coordination that emphasize the reduction of complexity (Lewis & Weigert, 1985), unified by the principle of free energy (cf. Bolis & Schilbach, 2020a; Ramstead et al., 2018) (Fig. 8.2). Leaning on dialectical (e.g., Vygotsky, 1978) and Bayesian (e.g., Clark, 2013; Friston, 2013) accounts of cognition and action, we also regard trust as a core process of dialectical attunement. Dialectical attunement construes human becoming as the dynamic interplay between (social) internalization and (collective) externalization in and through culturally mediated social interaction (Bolis, 2020; Bolis & Schilbach, 2020a). Here, internalization is thought of as the co-construction of bodily hierarchical models of the (social) world and the organism. Externalization is taken as the collective transformation of the world.

Our notion of internalization is largely based on predictive processing conceptualizations (Bolis, 2020; cf. Friston, 2013; Clark, 2013). Predictive processing has been defined as a hierarchical bidirectional process through which an organism adjusts itself in order to "optimally" predict environmental and bodily regularities. In brain function, predictions are continuously generated and propagated from higher levels of the neural hierarchy to lower ones in an attempt to explain away prediction errors, i.e., the discrepancy between incoming information and generated predictions. On the other hand, prediction errors are propagated from lower levels of the hierarchy to higher ones in order to suitably readjust the organism. The ultimate goal of such a process is to minimize prediction error as far as possible, through processes such as perception and learning. Such hierarchical structures should be considered as collectively shaped. First, we dynamically embody others in and through social interaction, shaping each other's hierarchical structure (Bolis, 2020; Bolis & Schilbach, 2020a), and second, such structures might even be socially extended into interbodily configurations (Ramstead et al., 2018). The structure and culture of social groups are two possible avenues to achieve such configurations.

**Fig. 8.2** At a collective level, trust facilitates efficient group behavior, reducing group complexity and overall energy consumption

Yet, organisms such as humans are not passive observers of reality who merely try to adapt to it (cf. Bolis, 2020; Bolis & Schilbach, 2020a). Organisms continuously interact with their world (including their own body), adjusting it according to their prior expectations (cf. Friston, 2013; Clark, 2013). For instance, body temperature tends to fall below expected values in extremely cold environments. Shivering, lighting a fire, or choosing to go into a warm space typically reverses such a decrease in body temperature, helping keep it within well-defined bounds. Processes of actively controlling the body and the environment in order to minimize prediction error have been described as active inference (Friston, 2013; see also Clark, 2013). However, such processes should not be exclusively attributed to the individual. For instance, "architecture and technology can be viewed as a collective effort for reducing overall uncertainty by transforming the environment according to bodily and interpersonal expectations" (Bolis, 2020). In a nutshell, humans actively co-construct and co-regulate, in interaction with other organisms, their ecosocial niches, with the ultimate aim to facilitate survival of not only the individual but also the social group and the species as a whole (cf. dialectical attunement). Here we suggest that multiscale processes of trust between individuals and groups are cardinal to such an endeavor.

Such processes of prediction error minimization can be thought of as processes of complexity reduction subserving life. According to the free energy principle (cf. Friston, 2013), life is thought of as a natural process leading to a restricted number of states. For instance, a human being, conceptualized here as a system, typically inhabits a well-defined range of states across several dimensions, such as temperature, size, and body structure. A human corporeal system thus maintains a certain degree of order. Such a process of life implies a tendency to resist the second law of thermodynamics by keeping disorder (entropy) as low as possible. As entropy can be mathematically defined as the mean value of surprise over time, a living system also needs to keep surprise as low as possible. However, the precise calculation of surprise is not accessible to a living system, as this would require knowledge of the dynamics of all possible states of a given world. Therefore, in practice, an upper bound on surprise (i.e., free energy), as opposed to the exact value of surprise, is kept as low as possible. In turn, free energy minimization can be cast as prediction error minimization under simplifying assumptions. Taken together, according to the free energy principle, a living organism, such as a human being, stays alive by effectively minimizing overall prediction error.
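The chain of reasoning in this paragraph (entropy as average surprise, free energy as an upper bound on surprise) can be stated compactly. The notation below is the standard variational formulation and is introduced here for illustration; the symbols $o$ (observations), $s$ (hidden states of the world), $p$ (the organism's generative model), and $q(s)$ (its approximate belief about hidden states) are assumptions of this sketch and do not appear in the chapter.

$$
H = \mathbb{E}_{p(o)}\big[-\ln p(o)\big]
$$

$$
F = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
= -\ln p(o) + D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big]
\;\ge\; -\ln p(o)
$$

Here $-\ln p(o)$ is the surprise associated with an observation, and $H$ is its average, which corresponds to the mean over time only under an additional ergodicity assumption. Because the Kullback-Leibler divergence is never negative, $F$ can never fall below the surprise, so lowering $F$ pushes down an upper bound on it. Unlike surprise itself, $F$ depends only on quantities the organism has access to: its observations, its model, and its current beliefs. Under Gaussian assumptions, $F$ reduces, up to constants, to a sum of precision-weighted squared prediction errors, which is the sense in which free energy minimization can be cast as prediction error minimization.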

Similarly, trust can be viewed as a multiscale process of dialectical attunement via interpersonal prediction error minimization. Put simply, the trustor outsources a part of their active inference and predictive processing to the trustee. By assuming predictability and benevolence of another, within a compatible set of values and worldview, the trustor gives up control over the actions of the trustee. In so doing, it inextricably links interindividual processes of making sense and controlling the world, allowing for potential complexity reduction in broader scales, as well as within each individual (Fig. 8.2). Of course, trust also entails the risk of severe increase of complexity across various scales (from harm of the trustor to disorder of the group) in the case of betrayal or breakdown of trust.

Trust does not solely allow us to minimize uncertainty about conceptions and models of other people, groups, and institutions; it also mediates the extent to which we reduce uncertainty about the world at large and ourselves, by mediating the extent to which we internalize the predictive models and world views of other individuals, groups, and institutions. Trust mediates not only our view of others but also the extent to which we accept them as windows into a world that we can never see and understand in its entirety for ourselves. In other words, within a group, trust operates as a gate on the intragroup sharing of priors as well as a gate on the level of precision applied to those priors. We wish to emphasize that the increased coordination and reduced complexity afforded by intragroup trust is separate from, or orthogonal to, the potential maladaptive outcomes of that group coordination. Take, for example, the collective suicides of doomsday cults. You may well argue that they are capable of incredibly effective coordinated behavior, but the outcome of that coordination and integration is clearly maladaptive. This has applications to understanding both the phenomenon of groups centered around charismatic leaders and emergent networks of trust like those that arise in social media echo chambers.

#### **The Formation and Maintenance of Trust: Bottom-Up and Top-Down Interactions**

Despite the risk to material and psychological benefits it entails (Lewis & Weigert, 1985), trust is often established quickly and without thorough verification due to the frequent interactions necessary for social life (particularly in large societies). As trust ultimately lies in a probabilistic belief about the other, not in certainty, there is a question of how much we can "trust trust" (Gambetta, 2000). We often have to rely on explicit and implicit heuristics that allow us to attribute trust quickly. These heuristics emerge at the intersection of top-down and bottom-up processes, all of which hinge on perceived similarity.

From the top-down, explicit direction, the predominant marker of trust is group identity as manifested in markers of group affiliation (Platow et al., 2012) and stereotypes (Foddy et al., 2009). In practice, these provide the quickest markers and indicators of trustworthiness. From the bottom-up perspective, there are both active and passive forms of implicit trust formation. On the passive end, people who are viewed as more similar to oneself are more likely to be empathized with, trusted, and vice versa (DeBruine, 2002, 2005). Conversely, similar neural responses to naturalistic stimuli in audiences predict affiliation (Parkinson et al., 2018). On the active end, enforced or created similarity, either through mimicry or through joint action and cooperation, can also create a sense of affinity that translates into trust, empathy, and subsequent benevolence (Chartrand & Bargh, 1999). Both of these paths (top-down and bottom-up) point to trust as a bridge between individual phenomenology and homeostasis, as well as group formation and maintenance.

This can be observed in the bidirectional interaction between implicit and explicit forms of coordination, whether affective, somatomotor, or behavioral. Studies have found that mimicry of behavior can create a sense of affiliation and trust, and these are in turn associated with increased behavioral, affective, and cognitive coordination (cf. "The Chameleon Effect", Chartrand & Bargh, 1999). Conversely, neuroscience has found that subjects' pain centers become active when observing another in pain, a phenomenon which is considered a marker of empathic concern (reviewed in Lamm, 2011). In fact, not only my perception of the pain of the other, but even my very feeling of pain is socially modulated, being dependent on both embodied social factors and personal attachment styles (Fotopoulou & Tsakiris, 2017). Curiously, when subjects are induced to distrust a mock subject (after observing them engage in traitorous behavior), this vicarious pain response is diminished (Hein & Singer, 2008). This diminished response is also observed when observing members of a perceived out-group experience pain (Hein & Singer, 2008), suggesting that trust is a mediator of coordination, even at a basic somatomotor level, modulating the extent to which the perceived internal states of another will impact my own. Coordination is the key word here: when people trust each other and begin to resonate with each other, they do not merely mimic or imitate each other. Rather they become part of a coordinated group; they plug into a collective entity linked by common priors and shared mutual confidence, with downstream effects on their internal states. Put simply, you internalize the entities around you, and they internalize you (cf. Bolis & Schilbach, 2020a). Our brains become less independent from one another when we trust each other.

An illustrative example of extremes of trust formation and its manifestation in coordination and optimality is the training of groups of soldiers into platoons. From the start, they are stripped of other affiliations, allegiances, and markers, and, generally without explanation, induced to engage in coordinated motor behavior, such as simultaneous and repetitive behavioral exertions and utterances. They also undergo similarly harrowing, painful experiences and suffering. These early difficult experiences are often cited as the foundation of the later sense of affiliation and trust, to an extent rarely observed in the civilian world. From the top-down, soldiers are encouraged to strip themselves of group affiliations and identities and submit to a common identity, a common goal, and a unifying mythology, as well as a belief in their complete interdependence and complete reliance on one another for their very survival. The result, when successful, is a degree of coordination, efficiency, and ability to quickly align and pursue common goals that is virtually unparalleled in daily life. In all instances, there is a convergence of shared experience, coordinated behavior, repeated interaction, as well as a shared worldview and identity.

Whether emerging from explicit or implicit sources, trust either consists of, or at least uses as its prime heuristic, similarity. Perceived similarity is correlated with trust when controlling for other variables, and engaging in behavioral synchrony or coordination with others can foster a sense of affiliation and trust (DeBruine, 2002, 2005; Chartrand & Bargh, 1999). Similarity here acts as a heuristic element. By its virtue, I can make fewer extraneous assumptions about behavior, as well as develop models of others using my own priors. It should be clear that similarity in this case does not simply refer to the static, superficial identification of similarity, as in a photograph, but to the whole action set of a person, their behavioral cadence and the idiosyncrasies of their movements and expressions, which are reflective of their internal drives, preferences, and values. Perhaps through the practices of institutionalized as well as spontaneous mimicry, coordination, and joint attention, there is also fostered a sense of a shared worldview. This manifests at the cognitive level but, more importantly, at the level of values, underlying drives, points of attention, reaction patterns, affective trajectories, and also less independent internal states by virtue of self-other resonance (e.g., mirroring, contagion, mimicry).

Theories on attachment revolve around trust. Whether the caregivers are mostly predictable in meeting needs leads to either trusting relationships later on or something maladaptive, such as the need to seek absolute security (Mikulincer, 1998). Deficits of trustfulness and of the ability to form trusting relationships imply deficits in forming parts of coordinated groups, because one's somatomotor and affective integration into the group state might be inefficient. The possible exception to this trend is affiliation with large-scale group identity in which there is less implicit, face-to-face, bottom-up somatomotor/affective interaction, e.g., ideological camps joined from a computer screen. This implies that people with emotional or affective problems and interpersonal trust issues may still be able to form part of mass movements. Hence the authoritarian loner archetype described in works such as Eric Hoffer's *The True Believer*: disaffected loners or self-perceived losers/exiles who are fertile terrain for recruitment into fanatical ideologies and associated tribal identities.

Interpersonal traumas and breakdowns in relationships can be reduced to, or at least related to, a violation or erosion of trust (Lewis & Weigert, 1985). A current danger relevant to our subject is the breakdown in shared realities in large-scale societies. This has occurred in large part through social media algorithms by which people are "siloed" into viewing only content that already conforms to and encourages their pre-existing beliefs, fostering disparate and, more importantly, incommensurable realities, thus irrevocably destroying the capacity for trust. If I perceive someone to have an incompatible concept of benevolence (one so incompatible that it affects my perception of their predictability), then I am incapable of trusting them and hence incapable of forming part of coordinating groups with them. This phenomenon can be seen crystallized in the disruption of work forces and teams on ideological grounds and, at a larger scale, in the fragmentation of societies in periods of civil war.

We should not solely focus on the loss of the positive effects of trust within groups: the loss of coordination, the loss of empathy, or the loss of affiliation. We must also examine the gain of negative properties and outcomes, beyond dislike or disagreement, down to a dysfunction in empathy and associated coordination at the somatomotor and affective level. The loss of trust and affiliation may modulate our subconscious affective and somatomotor processing such that, when we are confronted with members of a distrusted group, their suffering has a reduced power to move us and motivate sympathy and compassion. This could hypothetically make it more likely for harm to be inflicted by otherwise typical people. For instance, dehumanization of the other (an ultimate form of trust disruption) through language, coupled with reinforcement of in-group coordination (through rallies, synchronized behaviors, and symbols), has been deliberately deployed to enable subsequent mass atrocities (cf. Scarry, 1985).

#### **Empirical Implications and Future Experiments**

Due to conceptual and methodological constraints, research on trust has largely focused on either the individual or the sociological (reviewed in Lewis & Weigert, 1985). Here, we emphasize the importance of studying intrapersonal and interpersonal processes in their inherent interrelation, as they unfold during social interactions and beyond (Bolis & Schilbach, 2017, 2020a, b; Bolis, 2020; De Jaegher & Di Paolo, 2007; Dumas et al., 2014, 2020). The concept of trust as a bridge between individual prediction error and group complexity (Fig. 8.2) is intended to spur novel work. To this end, we suggest empirically studying the links between the phenomenology of trust, the behavioral and neural correlates of minimization of prediction error, and complexity and efficiency at the group level. As we discussed throughout this chapter, trust can be thought of as lying at the dynamic intersection of the individual and the collective, entailing both bottom-up processes (somatomotor and affective forms of coordination) and top-down processes (contextual and reputation-based factors). Combined, trust can thus be studied as a single interconnected construct. Here, we envision a research line which will elucidate the links between trust processes across scales, extending from implicit behavior and contagion, all the way up to conscious phenomenology and further up to patterns of collective coordination in brain and behavior. In what follows, we describe an experimental framework, namely collective psychophysiology, as well as an analysis scheme, namely multi-level analysis of intersubjectivity, that could help us do so (cf. Bolis & Schilbach, 2020a).

Traditionally, psychophysiology as a research paradigm has enabled the empirical investigation of the interrelation between physiological and psychological processes, offering important insights about the mechanisms at the level of the individual. However informative this kind of endeavor may have been, the inherent dynamics of social constructs, such as trust, will remain largely unexplored until dynamic interpersonal processes are systematically considered, as, for instance, (social) cognition might be fundamentally different when we really interact with others (Schilbach et al., 2013). In fact, it has been argued that the most important experience of the other stems from the archetypal situation of face-to-face social interaction, while all other cases remain mere products of it (Berger & Luckmann, 1967).

Building upon empirical frameworks of interpersonal research (e.g., Bolis & Schilbach, 2020a; Dumas, 2011; Froese et al., 2015; Koike et al., 2016; Montague et al., 2001; Schilbach et al., 2013), the paradigm of collective psychophysiology allows for the empirical investigation and systematic manipulation of real-time social interaction, across various modalities and temporal scales. To give a concrete example of such a framework, in the special case of two-person psychophysiology (Bolis & Schilbach, 2018), study participants sit opposite each other, working on tasks either individually or collectively, while being able to interact via a microcamera communication system. Such a two-person framework allows for the monitoring and systematic manipulation of processes that lie at different levels of organization, from psychophysiology to culture. In fact, by systematically controlling the diversity of the interacting individuals across various dimensions, such as age, culture, social class, or even psychological condition, core interpersonal processes of trust can be put to the test: emerging contextual and interpersonal differences and similarities in social interactions might prove equally or even more important than individual traits in building and maintaining relationships of trust (cf. the dialectical misattunement hypothesis; Bolis et al., 2017). For instance, it has been shown that it is the interpersonal similarity of (autistic) traits that primarily predicts friendship quality in the general population and not the traits per se (Bolis et al., 2020).
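The contrast between trait similarity and trait level can be made concrete with a minimal sketch. It is not the analysis used in the cited studies: the dyad scores below are invented purely for illustration, and the helper names `pearson` and `dyad_predictors` are hypothetical. The sketch derives two predictors per dyad (the mean trait level and the absolute trait difference) and correlates each with a friendship-quality rating.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def dyad_predictors(trait_a, trait_b):
    """Return (mean trait level, trait dissimilarity) for one dyad."""
    return (trait_a + trait_b) / 2, abs(trait_a - trait_b)

# Hypothetical data: (trait score of A, trait score of B, friendship quality)
dyads = [(10, 12, 8.5), (30, 29, 9.0), (5, 25, 4.0), (18, 35, 5.0), (22, 22, 9.5)]

levels, gaps = zip(*(dyad_predictors(a, b) for a, b, _ in dyads))
quality = [q for _, _, q in dyads]

r_level = pearson(levels, quality)  # association with the trait level itself
r_gap = pearson(gaps, quality)      # association with interpersonal dissimilarity

print(f"trait level vs quality:      r = {r_level:.2f}")
print(f"trait difference vs quality: r = {r_gap:.2f}")
```

In this toy data set, friendship quality tracks the difference between partners far more closely than their average trait level, which is the pattern the similarity account would predict. A real analysis would need a proper dyadic model and a much larger sample.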

While laboratory studies typically offer excellent experimental control, the collective psychophysiology of trust can (and should) eventually be examined beyond the laboratory walls, where it manifests itself in real-world social life. Two paradigmatic cases of such scenarios can be found in pedagogy and psychotherapy. In fact, in line with pedagogical and clinical insights (cf. Bolis, 2020; Bolis & Schilbach, 2020a; Hendren & Kumagai, 2019; Koole & Tschacher, 2016; Lee, 2007; Ramseyer & Tschacher, 2011; Terrell & Terrell, 1984; Thompson et al., 2004), we suggest that the formation of trust and interpersonal attunement between the educator and the student, as well as the therapist and the patient, might be a first step of pivotal importance to eventual educational and therapeutic success. Here, multipersonal neuroimaging and motion tracking could be deployed in order to capture the interpersonal mechanisms of real-time social interactions in classrooms and psychotherapeutic settings (cf. Bolis et al., 2017; Dikker et al., 2017; Lahnakoski et al., 2020; Tschacher et al., 2014), complemented by digital phenotyping and interactive self-reports (cf. Bolis et al., 2020; Insel, 2017).

It may be fruitful as well to explore phenomena that exemplify our framework and its predictions in action: one possible example is placebo, in which individuals seemingly internalize not just the abstract beliefs about the properties of a drug or procedure/ritual but also the internal states that would be implied by the belief. One could possibly interpret this as the internalization of affective and homeostatic priors, which powerfully suggests that the mitigating factors in the placebo effect are the degree of trust in the physician/clinician/healer/belief system surrounding the treatment, as well as trustfulness as a trait in the patient. This merits study.

Parts of this chapter discussed trust in terms of interpersonal predictive processing and active inference. Here, we suggest moving from exclusively focusing on the isolated individual toward a multilevel understanding of intersubjectivity and psychopathology (cf. Bolis & Schilbach, 2017, 2020a). It would also be interesting to observe the dynamic formation of trust within a group together with its neural and behavioral correlates, as well as group-level measures such as efficiency and complexity. This means observing not only formed groups but also the formation of groups and their maintenance: how the success or failure of this process is modulated by bottom-up and top-down processes, what their respective contributions are, and how these factors co-vary with the emergence of group schisms and reformations, whether induced or spontaneous. This could be fruitfully applied in specific instances where trust mediates information flow, such as in the classroom setting. Taken together, collective psychophysiology, we suggest, appears as a promising empirical framework for studying trust, enabling both great experimental control and ecological validity (cf. Bolis, 2020; Bolis & Schilbach, 2020a).

#### **Conclusion**

To conclude, we define trust at its core as a belief and behavior in accordance with predictable benevolence in another within a compatible worldview. Classical theories and studies on trust still bear the mark of behaviorism, with little regard for social interaction dynamics and the informational depth provided by subjective experience, particularly affect. They also lack any computational account of how these relate to brain function and cognition, owing in part to the limitations in experimental methods and the conceptual commitments of the time.

Our primary contribution may be to tie the group-level reduction of complexity not only to extant concepts of trust at the individual level, mainly individual differences in trustfulness, or game theoretical conceptions of trust, but deeper, to fundamental theories about cognition, specifically predictive processing and active inference. In relating levels of analysis and inquiry that encompass a conceptual space bridging cortical hierarchies, upwards/outwards to hierarchies of individuals within groups and groups within civilization, we have a tentative framework with which to potentially examine trust at all these levels separately, in relation, and simultaneously. Thus, we provide a novel synthesis and a framework which is capable of empirical implementation and moreover carries with it novel domains in which to study the phenomenon of trust and its manifestations in daily life.

How does the novel synthesis presented in this framework help us? What does it add to our knowledge beyond description? How can we use this model to better understand, and perhaps mitigate, the maladaptive outcomes of the heuristics we use for the formation of trust? Broadly, our model provides a unified framework and common language with which to address a massive swath of human experience, with putative scale-free properties. More specifically, it may focus awareness on the extent to which the perception of similarity or dissimilarity among people can create coordination that extends far beyond superficial measures of worldview or declarations of goodwill, down to coordination between somatomotor and affective states.

This highlights the tremendous importance of trust for maintaining and facilitating the wellbeing of society, and the tremendous risk that comes when, due to shortsighted ambitions, leaders and institutions undertake actions that create distrust between the components of society (such as local population and immigrants), as this goes beyond damaging goodwill to possibly forming a material, concrete antecedent to schisms in relationships, groups, societies, and civilizations. Unveiling the mechanisms of trust formation and breakdown across various domains of human life, ranging from relationships and pedagogy to marketing and politics, may help facilitate group coordination and individual well-being, but also boost immunity against the misuses of trust. Taken together, we hope this work ultimately makes starkly apparent that the role and importance of trust are difficult to overestimate.

From a broken leg and a kind stranger, we have attempted to take the reader inwards to the predictive and homeostatic processes underlying cognition and affect, and outwards, to the formation and maintenance of collective entities. In doing so, we hope we have conveyed that trust is a root fundament of the structure of human life in every form of interaction, from dyads to civilizations.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Part III Clinical Neuroscience**

## **Chapter 9 The Time Has Come to Be Mindwanderful: Mind Wandering and the Intuitive Psychology Mode**

#### **Óscar F. Gonçalves and Mariana Rachel Dias da Silva**

**Abstract** No matter how hard you try (pinching different parts of your body, slapping your face, or moving restlessly in your seat), you cannot prevent your mind from occasionally escaping from the present experience as you enter into a mental navigation mode. Sometimes spontaneously, at other times deliberately, your mind may move to a different time: you may see yourself running an experiment inspired by the chapter you just finished reading or you may imagine yourself on a quantum leap into the future as you fantasize about the delivery of your Nobel Prize acceptance speech. Your mind may move to a distinct space, for example, as you replay last weekend's party or anticipate a most desirable date, and may even venture into the mind of another (e.g., as you embody the mind of the author you are currently reading). Our minds can accomplish all this mental navigation in fractions of a second, allowing us to see ourselves or even impersonate different people across space and time. While teleportation and time travel may never be physically possible, our wandering minds are indeed very accomplished "time machines" (Suddendorf & Corballis, 2007).

**Keywords** Mind wandering · Perceptual decoupling · Mental improvisation · Mental navigation

Ó. F. Gonçalves (\*) Proaction Lab, CINEICC – Faculty of Psychology and Educational Sciences, University of Coimbra, Coimbra, Portugal e-mail: oscar@fpce.uc.pt

M. R. D. da Silva Cognitive Science and Artificial Intelligence Department, Tilburg University, Tilburg, The Netherlands

#### **Introduction**

No matter how hard you try (pinching different parts of your body, slapping your face, or moving restlessly in your seat), you cannot prevent your mind from occasionally escaping from the present experience as you enter into a mental navigation mode. Sometimes spontaneously, at other times deliberately, your mind may move to a different time: you may see yourself running an experiment inspired by the chapter you just finished reading or you may imagine yourself on a quantum leap into the future as you fantasize about the delivery of your Nobel Prize acceptance speech. Your mind may move to a distinct space, for example, as you replay last weekend's party or anticipate a most desirable date, and may even venture into the mind of another (e.g., as you embody the mind of the author you are currently reading). Our minds can accomplish all this mental navigation in fractions of a second, allowing us to see ourselves or even impersonate different people across space and time. While teleportation and time travel may never be physically possible, our wandering minds are indeed very accomplished "time machines" (Suddendorf & Corballis, 2007).

The concept of mind wandering is still very fuzzy and heterogeneous. As such, distinct authors seldom agree on a common definition (Christoff et al., 2016; Seli et al., 2018). Despite this lack of agreement, the adoption of a family resemblances view of mind wandering, which embraces the heterogeneity of the phenomenon, is key to further advancing the field. Here, we define *mind wandering* as the process by which the mind decenters from the current task and stimulus conditions (Stawarczyk et al., 2011a), moving freely (Christoff et al., 2016) toward multiple space, time, and/or mind positions (Corballis, 2013).

In what follows, and as summarized in Fig. 9.1, we first maintain that this wandering process represents our mind/brain's default mode. Second, we describe three distinct but interrelated psychological mechanisms involved in mind wandering: perceptual decoupling, mental improvisation, and mental navigation. Third, we argue that mind wandering has the core function of priming our minds into a psychosocial mode (i.e., a folk/intuitive psychology). Finally, we conclude by suggesting that maybe the time has come to move beyond what Corballis (2015) refers to as the "bad press" that mind wandering has been facing and start acknowledging the benefits of mind wandering.

#### **Minds Wandering by Default**

**Fig. 9.1** Nature and functions of mind wandering

Let us begin by substantiating the claim that mind wandering constitutes our mind's default mode. Killingsworth and Gilbert (2010) published in *Science* the results of a real-time, large-scale thought sampling study. Thought probes were sent to participants at random times throughout the day by means of a smartphone application, requiring participants to report on the content and nature of their thoughts. An analysis of responses from 2250 adults confirmed that, for about half of the day (i.e., 47%), individuals reported mind wandering (i.e., "thinking about something other than what they were currently doing"). Interestingly, mind wandering was transversal to most of their daily activities. In fact, the nature of people's activities explained no more than 3.5% of the between-person variance in mind wandering.
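The headline figure from a thought sampling study of this kind is simply the share of probes answered as off-task. The sketch below illustrates how such probe responses could be tallied; the record layout, the function name `mind_wandering_rates`, and the handful of probe records are all hypothetical and do not reproduce the original data set or analysis.

```python
from collections import defaultdict

def mind_wandering_rates(probes):
    """Tally probe responses.

    probes: iterable of (participant_id, activity, is_mind_wandering).
    Returns (overall proportion of off-task probes, per-participant proportions).
    """
    counts = defaultdict(lambda: [0, 0])  # participant -> [off-task probes, all probes]
    for pid, _activity, off_task in probes:
        counts[pid][0] += int(off_task)
        counts[pid][1] += 1
    total_off = sum(off for off, _ in counts.values())
    total_all = sum(n for _, n in counts.values())
    per_person = {pid: off / n for pid, (off, n) in counts.items()}
    return total_off / total_all, per_person

# Hypothetical probe records, for illustration only
probes = [
    ("p1", "working", True), ("p1", "commuting", False), ("p1", "eating", True),
    ("p2", "working", False), ("p2", "talking", False),
    ("p2", "resting", True), ("p2", "working", True),
    ("p3", "commuting", False),
]

overall, per_person = mind_wandering_rates(probes)
print(f"overall off-task proportion: {overall:.2f}")
for pid, rate in sorted(per_person.items()):
    print(f"{pid}: {rate:.2f}")
```

Estimating how much of the between-person variance is explained by activity would additionally require a multilevel model, which is beyond this sketch.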

The ubiquity of mind wandering is even more impressive if we move beyond daily wakeful activity. When we add dreaming to the equation, the prevalence of mind wandering increases dramatically. As argued by Fox et al. (2013), dreams, particularly during REM (Rapid Eye Movement) sleep (Mutz & Javadi, 2017), may be considered an extreme form of mind wandering (Andrews-Hanna et al., 2018; Domhoff, 2018), sharing common audio-visual, fantasy, and spontaneous activity (Christoff et al., 2016). This is likely the reason why mind wandering is often taken as synonymous with "daydreaming" (Regis, 2013; Stawarczyk, 2018). As in REM dreaming, waking mind wandering entails a process of spontaneous activity eluding the frontiers of space and time. Curiously, impersonation has also been reported in dreaming phenomenology (Schredl, 2019).

By default, both day and night, our minds wander. It is now widely acknowledged that the brain remains highly active during states of mind wandering. Raichle et al. (2001) coined the term *Default Mode Network* (DMN) to refer to a network connecting the medial frontal cortex with the posterior cingulate, precuneus, and inferior parietal cortex, shown to be particularly active when individuals are not requested to perform a specific task in an fMRI (functional magnetic resonance imaging) environment (Raichle, 2010). In such task-negative (default) states, our brains sustain high levels of activity. Metabolically speaking, our brain is a very expensive organ. It spends about 10 times more energy than would be expected from its volume and mass, such that the majority of its metabolism is associated with "off-task" and "stimulus independent activity" (70–80%). Conversely, it is estimated that "task-evoked activity" accounts for no more than 5% of the brain's total energy consumption (Raichle, 2010).

There is now abundant evidence that our mind's default mode (i.e., mind wandering) is supported by the brain's DMN (Christoff et al., 2009; Kirschner et al., 2012; Mason et al., 2007). To illustrate, a recent study by Scheibner et al. (2017) confirms such evidence of DMN activity during mind wandering. In their study, participants were instructed to focus either on their own breathing (internal attention condition) or on tones (external attention condition) while fixating on a white cross. Once the cross turned red, participants were requested to report whether they were focused or mind wandering. Core regions of the DMN (medial prefrontal cortex, posterior cingulate cortex, and left temporoparietal junction) were significantly more active during instances of mind wandering than when participants reported being focused (either externally or internally). Relatedly, Stawarczyk et al. (2011b) also found that, when contrasted with being on-task, mind wandering was associated with clusters of increased activity in core DMN nodes (e.g., medial prefrontal, posterior cingulate, inferior parietal lobe). In addition, this activity was also evident in extended nodes of the DMN (e.g., parahippocampal cortex; inferior and medial temporal gyrus), indicating that core regions of the DMN interact with subnetworks, including the medial temporal lobe subsystem and the dorsal medial subsystem. Meta-analyses using Neurosynth (http://neurosynth.org) indicate that, while the core DMN nodes are engaged in self-referential processes, the medial temporal and dorsal medial subsystems are engaged in episodic memory and social cognition, respectively (Andrews-Hanna et al., 2014). As such, different DMN subsystems seem to be supporting mental processes that are prevalent during mind wandering (i.e., self-referential, episodic, and social cognitive processes). A recent study by Poerio et al. (2017) confirmed that the connectivity between and within different DMN subsystems supports the multicomponent nature of mind wandering, particularly with regard to perceptual decoupling and memory retrieval.

Importantly, nodes of this DMN are often anti-correlated with nodes active during tasks requiring focused attention (Fox et al., 2005). However, executive networks are also known to be a neural correlate of off-task thinking (Dixon et al., 2017), including mind wandering (Christoff et al., 2009; Kam & Handy, 2014). Activity in these networks may seem counterintuitive, considering their recruitment during task-positive, goal-directed thought. However, recent studies indicate that executive networks also serve to regulate attention back and forth between the external environment and internal thoughts and are similarly recruited during mind wandering in order to sustain an internal train of thought (Christoff et al., 2016). A further, indirect confirmation of the role played by the DMN in mind wandering comes from research showing that extended regions of the DMN are involved during REM dreaming (Fox et al., 2013; Sämann et al., 2011).

In sum, mind wandering constitutes the mind's as well as the brain's default mode. Different DMN subsystems seem to cooperate, allowing the mind to perceptually decouple and to venture into a mode of mental improvisation and mental navigation.

#### **Mind Wandering Processes**

As illustrated in Fig. 9.2, mind wandering depends on three interconnected processes: perceptual decoupling, mental improvisation, and mental navigation. The process can be triggered either by bottom-up (e.g., perceptual fatigue–Boksem et al., 2006) or top-down mechanisms (e.g., memory retrieval–Baird et al., 2011).

In our lab, we are currently studying the contribution of these three different mechanisms to the mind wandering process. We administered a large-scale questionnaire study in which participants answered questions concerning their tendency to disengage from the environment as they mind wander, the dynamics and variability of their mind wandering thoughts, and their general tendency to mind wander across space and time. Specifically, items from our *Mind Wandering Inventory* (Gonçalves et al., 2020) were intended to capture the following dimensions:


Moreover, we examined the relationship between trait levels of mind wandering assessed with the Mind Wandering Inventory and state mind wandering probed during a vigilance task (Dias da Silva et al., 2020) and have validated the questionnaire with neurophysiological electroencephalogram (EEG) data (Dias Da Silva, Gonçalves, & Postma, 2022).

**Fig. 9.2** Mind wandering processes

#### *Perceptual Decoupling*

Mind wandering entails at least some degree of perceptual decoupling. While decoupling from the immediate perceptual experience, the individual switches to an internal processing mode. This is illustrated by Smilek et al. (2010), who found that during a reading task, individuals tend to blink significantly more when they report mind wandering. In addition, they found that during mind wandering periods, there were fewer ocular fixations, suggesting that direct eye avoidance is associated with the elimination of external stimulation sources and priming of internal processing. Moreover, Bristow et al. (2005) demonstrated that eye blinking is associated with the deactivation of a fronto-parietal network responsible for visual attention. Together, these findings indicate that perceptual decoupling from the immediate experience represents an important component of mind wandering.

Further evidence for perceptual decoupling comes from studies showing that the amplitudes of early (P1 and N1) and late (P3) perceptual evoked potentials are attenuated during mind wandering. Since both the P1 and N1 are early ERP components indexing processing during the sensory input stage, their reduction is taken as evidence for an inhibitory effect of mind wandering on external processing (Kam & Handy, 2013; Schooler et al., 2011). However, some recent findings show that, for tasks with low attentional demands, individuals are able to maintain appropriate levels of alerting, orienting, and executive attention during mind wandering (Gonçalves et al., 2017a, b) without impacting early and late perceptual evoked potentials (Gonçalves et al., 2018b). As such, the effects of perceptual decoupling are dictated by task demands. These findings indicate that perceptual decoupling is an important, but not an absolute, condition for mind wandering. Individuals with higher executive resources (e.g., working memory) are able to maintain some degree of external processing while at the same time mind wandering (Smallwood & Schooler, 2015).
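Attenuation of an evoked potential is commonly quantified as a difference in mean amplitude within a component's time window. The following minimal sketch shows that computation under stated assumptions: the two averaged waveforms are invented values in microvolts, the 300–500 ms window standing in for the P3 is an illustrative choice, and `mean_amplitude` is a hypothetical helper rather than a function from any EEG toolbox or from the cited studies.

```python
def mean_amplitude(erp, times_ms, window):
    """Mean amplitude of an averaged ERP waveform within a time window (inclusive)."""
    start, end = window
    values = [v for v, t in zip(erp, times_ms) if start <= t <= end]
    return sum(values) / len(values)

# Sample times in ms after stimulus onset (hypothetical 50 ms spacing)
times_ms = list(range(0, 501, 50))

# Hypothetical averaged waveforms in microvolts, for illustration only
erp_on_task = [0, 1, 2, -1, 0, 2, 5, 6, 5, 3, 1]
erp_mind_wandering = [0, 1, 2, -1, 0, 1, 3, 4, 3, 2, 1]

P3_WINDOW = (300, 500)  # assumed window; real studies choose this per paradigm

p3_on = mean_amplitude(erp_on_task, times_ms, P3_WINDOW)
p3_mw = mean_amplitude(erp_mind_wandering, times_ms, P3_WINDOW)
attenuation = p3_on - p3_mw

print(f"P3 on-task: {p3_on:.1f} uV, mind wandering: {p3_mw:.1f} uV")
print(f"attenuation: {attenuation:.1f} uV")
```

A positive attenuation value here would correspond to a reduced late component during mind wandering; under low task demands, the findings discussed above suggest this difference can shrink toward zero.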

#### *Mental Improvisation*

In order to understand the function of mind wandering, it is important not only to look at the content but also to characterize the dynamics of thought. By investigating mind wandering over time, we can see that thoughts tend to evolve freely from one topic to the next, sometimes coming back to a core theme. For example, while writing an article, it may suddenly come to mind that there are a couple of emails that need to be answered. That reminds you of the current status of the computer you ordered a few weeks ago. You then recall the conversation with the salesperson. Then you get back to the emails and think about the email with the invitation to visit a foreign lab. You remember your last visit, the dinner you had with your friends and all the fun you had. This memory brings you back to the thought that you have to respond right away to that email. In sum, mind wandering dynamics seem to entail a process of free, but not necessarily random, thought movement. This dynamic of free movement is responsible for the heightened variability of thought content (Mills et al., 2018). Mills et al. (2018) coined the "default variability hypothesis" to refer to the process by which the dynamics of free movement between thoughts favors the encoding of separate memory episodes and the consolidation of episodic memories into semantic knowledge. It is precisely this variability that distinguishes mind wandering from ruminative thoughts (i.e., the persistence of sticky and recurrent thoughts). In fact, there is evidence that these ruminative thoughts are associated with task-related interference (Dias da Silva et al., 2018). The misclassification of rumination as mind wandering has been in large part responsible for the widespread misattribution to mind wandering of costs in terms of attention (Hu et al., 2012), executive functioning (McVay et al., 2013), and mood (Wilson et al., 2014). In contrast, mind wandering defined as a process of mental improvisation has consistent benefits in terms of creativity (Baird et al., 2011), memory consolidation (Mills et al., 2018), and mental simulation (O'Callaghan et al., 2015).

Recently, Marron et al. (2018) found that the degree of mental improvisation, as expressed in terms of free association fluency, flexibility, and semantic remoteness during a free association task, is related to an increase in activation of the DMN and a decrease in activation of core nodes from the executive network (e.g., Inferior Frontal Gyrus). Notably, these free association markers were positively correlated with creativity measures but not with intelligence scores. As in theater or music improvisation, our thoughts seem to have a mind of their own, moving freely but often recurring around a specific theme before departing into a new associative dynamic. Curiously, studies on music improvisation consistently report a deactivation in brain regions responsible for executive functioning (i.e., dorsolateral prefrontal cortex) and the concurrent activation of core nodes of the DMN (e.g., anterior cingulate; Beaty, 2015; Landau & Limb, 2017).

#### *Mental Navigation*

Mind wandering is also characterized by a process of mental navigation. While time travel is most often acknowledged in mind wandering, a mental navigation mode is better illustrated by the existence of a triple de-centration: time de-centration, space de-centration, and mind de-centration. Next, we will briefly address each of these mental navigation components.

Several studies have shown that mind wandering entails a time-travel process. Mind wandering has a remarkable temporal orientation (90%), allowing the individual to navigate between the past (~29%), present (~12%) and, above all, the future (~48%, Smallwood & Schooler, 2015). Notably, about half of the time, the mind wanders to some time in the future. This suggests that, along with an eventual consolidation of past memories, mind wandering frees the individual from the here and now, simulating the future and potentiating autobiographical planning (Stawarczyk et al., 2013).

This time travel is inseparable from the correlative process of space navigation, that is, mentally moving from the "here and now" to the "there and then". Similarly to what happens with episodic memory, the mental evocation of past or foreseen contexts is hippocampal dependent (Tulving, 2002). A recent study by McCormick et al. (2018) found that patients with hippocampal amnesia were no different from healthy controls in reporting high levels of mind wandering during quiet restful moments. However, their instances of mind wandering were found to be mostly dependent on semantic knowledge (i.e., closer to ruminative thoughts than mental improvisation), in contrast with the episodic content more typically found for healthy controls. In support of this space navigation process, healthy controls reported having mind wandering thoughts of an intense sensorial quality, particularly concerning the experience of visual scenes.

However, mind wandering is not an uninhabited scenario. That is, during mind wandering, time and space de-centration often go together with mind navigation. This third type of de-centration, mind de-centration, is characterized by the ability to tune in to others' experiences, move into their minds, and imagine how they are thinking, feeling, or behaving. Although there are some cultural and individual differences, individuals often (~50%) adopt a third-person perspective when mind wandering; that is, they see the world from the viewpoint of an outside observer (Christian et al., 2013). Building on evidence from this third-person perspective along with findings indicating the involvement of core DMN nodes responsible for social cognitive processes (Davey et al., 2016; Li et al., 2014; Mars et al., 2012; Poerio & Smallwood, 2016), we can assert along with Corballis (2015) that mind wandering may be central for developing a theory of mind (i.e., the ability to identify or attribute mental states in ourselves and others).

#### **Mind Wandering Function: Priming the Psychosocial Mode**

We now argue that the processes of perceptual decoupling and mental improvisation, together with a triple (time, space, mind) de-centration, promote a reorientation of the mind from the current physical reality (i.e., intuitive/folk physics) to a predominantly psychosocial mode (i.e., intuitive/folk psychology). While the understanding of physical reality predominantly requires systematic thinking, the comprehension of psychosocial phenomena relies mostly on reflective, creative, and empathic processes. As a result of dealing daily with physical and psychosocial phenomena, individuals develop a sort of intuitive physics (i.e., folk physics) and intuitive psychology (i.e., folk psychology). On the one hand, this intuitive physics translates the nature and degree of individual understanding of physical phenomena into an individual theory of the world. On the other hand, this intuitive psychology translates our personal understanding of the psychosocial reality into an individual theory of the mind (Baron-Cohen, 1997; Kamps et al., 2017).

As stated before, the DMN supports core socio-cognitive processes involved in developing an individual theory of the mind. The DMN, as a network supporting mind wandering, is also central in our orientation to the psychosocial domain. Curiously, the DMN is anti-correlated with the fronto-parietal (attention/executive) network predominantly involved in processing physical phenomena. This is illustrated in an interesting study by Jack et al. (2013). In their experiment, participants were presented with several problem-solving vignettes, some portraying tasks requiring reasoning about mental states and others requiring reasoning about causal/mechanical issues. The results indicate not only that the DMN was associated with the psychosocial domain and the fronto-parietal network with the physical domain but also that the activity of these networks was reciprocally inhibited.

This type of folk psychology (versus folk physics) orientation sustained by activating or deactivating the mind's default mode is also illustrated in Simon Baron-Cohen's *Systemizing–Empathizing Theory* (Greenberg et al., 2018). According to his view, people can be allocated to a dispositional continuum ranging from empathizing (i.e., drive to identify another person's emotions and thoughts and to respond to these with an appropriate emotion) to systemizing (i.e., drive to analyze, understand, predict, control, and construct rule-based systems).

Confirming the association between intuitive psychology and the DMN, Takeuchi et al. (2014) demonstrated that empathizing is positively correlated with resting state functional connectivity between different DMN nodes, particularly the medial prefrontal cortex, the dorsal anterior cingulate, the precuneus, and the left superior temporal sulcus. In contrast, systemizing positively correlates with resting state functional connectivity in an "external attention network" between the dorsolateral prefrontal cortex and the dorsal anterior cingulate cortex.

Indeed, the DMN is active during a range of tasks related to both mind wandering (Christoff et al., 2009; Kirschner et al., 2012; Mason et al., 2007; Scheibner et al., 2017) and theory of mind (Jack et al., 2013; Oliveira Silva et al., 2018; Takeuchi et al., 2014). However, shared DMN activation alone does not guarantee that the two are equivalent, and correlations do not imply causation. Nevertheless, these correlations provide grounds for future research on the manner in which mind wandering states may support such a theory of the mind. As recently shown in a series of studies, it seems that we are by default in some sort of empathizing mode (Oliveira Silva et al., 2018), with DMN activity and connectivity being central to maintaining this state of mind (Esménio et al., 2019a, b). Although research is still underway, it may very well be the case that our default mind wandering state significantly contributes to priming this similarly default empathizing/psychosocial mode, helping individuals navigate the socio-emotional world around them (Poerio & Smallwood, 2016).

#### **Concluding Remarks: A Time to Be** *Mindwanderful*

Over the last decades, the concept of mindfulness has grown in popularity (Tomlinson et al., 2018). Even though different definitions of mindfulness are available (Allen et al., 2012; Kabat-Zinn, 2003; Keng et al., 2011; Moore & Malinowski, 2009), most researchers see it as a process of directing the attentional focus to the individual's current experience in the present moment while avoiding thought escape into the past or future. As such, a mindful mind seems to be the opposite of a wandering mind (Schooler et al., 2014). For example, Mrazek et al. (2012) demonstrated that people with high levels of mindfulness report fewer instances of mind wandering and perform better on an attention focus task (i.e., mindful breathing).

Despite some controversy regarding conceptual and methodological aspects in mindfulness research (Van Dam et al., 2018), there is evidence for the benefits of mindfulness in terms of orienting attention to the current physical reality (Posner et al., 2015). In contrast to orienting attention to a physical reality, mind wandering optimizes an orientation to the psychosocial domain. In addition to being aware of the present moment, mindfulness can also refer to the act of being aware of one's own internal thoughts and not just stimuli in the external environment (Ellamil et al., 2016). As such, it could also be that mind wandering and mindfulness represent two ends of the same construct. Therefore, it is necessary to find an ideal balance between attention to the external world and our internal thoughts, while also mindfully being aware of the wandering mind and the benefits that might come with it. Now the question remains: How can individuals take full advantage of the benefits of mind wandering in order to facilitate navigation in the psychosocial domain?

Several studies are currently underway, testing whether we can impact mind wandering by modulating specific neural correlates. For example, building on EEG markers of mind wandering instances, we recently launched a series of studies that explore the viability of different real-time EEG protocols (e.g., SMR⇑Theta⇓; Theta⇑SMR⇓) in improving mind wandering during an attention task (Gonçalves et al., 2018a, b). Other authors are using different strategies to neuromodulate processes associated with mind wandering, such as transcranial direct current stimulation (Axelrod et al., 2015; Axelrod et al., 2018; Boayue et al., 2020), or even real-time MEG and fMRI (Garrison et al., 2013). Although these studies are in their beginning stages, as we progress in the identification of reliable brain predictors of mind wandering, we hope to come up with more reliable methods for detecting and impacting mind wandering (Hosseini & Guo, 2019; Jin et al., 2019).

The ideal balance between mindfulness and mind wandering is still not known (Schooler et al., 2014). However, it seems that, to facilitate navigation across both the physical and psychosocial domains, individuals may gain an advantage by adopting a *mindwanderfulness* position, that is, a process of strategically switching between mindfulness and mind wandering in order to respond adaptively to the demands of the physical and psychosocial domains (Gonçalves, 2019; Hasenkamp, 2018).

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 10 Social Cognition Development and Socioaffective Dysfunction in Childhood and Adolescence**

**Claudia Berlim de Mello, Thiago da Silva Gusmão Cardoso, and Marcus Vinicius C. Alves**

**Abstract** Social cognition refers to a wide range of cognitive abilities that allow individuals to understand themselves and others and also communicate in social interaction contexts (Adolphs, Curr Opin Neurobiol 11(2):231–239, 2001). According to Adolphs (Annu Rev Psychol 60(1):693–716, 2009), social cognition deals with psychological processes that allow us to make inferences about what is happening inside other people: their intentions, feelings, and thoughts. Although the term can be defined in many ways, it is clear that it must be reserved for the mental operations underlying social interactions. The most investigated cognitive processes of social cognition are emotion recognition and theory of mind (ToM), given that a whole range of socio-affective and interpersonal skills, such as empathy, derive from them (Mitchell RL, Phillips LH, Neuropsychologia, 70:1–10, 2015). Theory of mind is an intuitive ability to attribute thoughts and feelings to other people, and this ability usually matures in children at preschool age (Wellman HM, The child's theory of mind. Bradford Books/MIT, 1990). Emotional recognition refers to an individual's ability to identify others' emotions and affective states, usually based on their facial or vocal expressions. It is a critical skill that develops early and supports the development of other social skills (Mitchell RL, Phillips LH, Neuropsychologia, 70:1–10, 2015).

**Keywords** Social interactions · Theory of mind · Emotional recognition · Adolescence and childhood

C. B. de Mello

Department of Psychobiology, Universidade Federal de São Paulo, São Paulo, Brazil

T. da Silva Gusmão Cardoso Centro Adventista Universitário de São Paulo, São Paulo, Brazil

M. V. C. Alves (\*) Faculty of Health Sciences of Trairi, Universidade Federal do Rio Grande do Norte, Santa Cruz, Brazil

#### **Introduction**

Social cognition refers to a wide range of cognitive abilities that allow individuals to understand themselves and others and also communicate in social interaction contexts (Adolphs, 2001). According to Adolphs (2009), social cognition deals with psychological processes that allow us to make inferences about what is happening inside other people: their intentions, feelings, and thoughts. Although the term can be defined in many ways, it is clear that it must be reserved for the mental operations underlying social interactions.

The most investigated cognitive processes of social cognition are emotion recognition and theory of mind (ToM), given that a whole range of socio-affective and interpersonal skills, such as empathy, derive from them (Mitchell & Phillips, 2015). Theory of mind is an intuitive ability to attribute thoughts and feelings to other people, and this ability usually matures in children at preschool age (Wellman, 1990). Emotional recognition refers to an individual's ability to identify others' emotions and affective states, usually based on their facial or vocal expressions. It is a critical skill that develops early and supports the development of other social skills (Mitchell & Phillips, 2015).

Adverse Childhood Experiences (ACEs), such as parental neglect or physical, sexual, or psychological abuse, especially in the early stages of development, can have particularly harmful long-term consequences for the consolidation of cognitive, affective, and emotional skills (Herzog & Schmahl, 2018). A systematic review on the associations between early social environment, early-life adversity, and social cognition in major psychiatric disorders found that emotional and physical abuse, neglect, and avoidant attachment styles were the strongest predictors of ToM and emotion recognition deficits, as well as emotional dysregulation (Rokita et al., 2018). Prolonged exposure to events of this nature can lead to brain changes, particularly in circuits involved in regulating responses to stress, a condition captured by the concept of toxic stress (Shonkoff & Garner, 2012).

In clinical or psychopathological contexts, social and affective impairments worsen the social disadvantages that many patients face. As an example, in Autism Spectrum Disorder (ASD), ToM seems to be the most impaired social cognition domain (Baron-Cohen et al., 1985), whereas in schizophrenia, social impairments appear as negative symptoms and tend to predict patients' lower mental capacity. Several lines of evidence indicate that core social difficulties in ASD are best explained by deficits in controlled processes, such as ToM, rather than automatic ones, such as those dependent on emotional contagion (Hamilton, 2013). This discrepancy between the two seems to be in line with the hypothesis of empathy imbalance (Smith, 2009). As an example, the avoidance of eye or physical contact frequently observed in individuals diagnosed with this disorder can be explained by an exacerbated affective empathy, which is dependent on emotional contagion. On the other hand, a deficit of cognitive empathy, being associated with ToM, could explain the low performance in tasks with greater demand for mental state attribution.

Social cognition deficits are expressed, for instance, in difficulties in interpreting social cues and regulating behavior accordingly, which makes engagement in social relationships especially challenging. In ADHD, there are also failures to recognize emotions, especially anger and fear (Bora & Pantelis, 2016). Contrary to what is observed in autism, however, the difficulties tend to be milder and to improve with age. Thus, deficits in social cognition are early and prominent features of many neuropsychiatric, neurodevelopmental, and neurodegenerative disorders (Agnew-Blais & Seidman, 2013). In the most recent edition of the American Psychiatric Association's Diagnostic and Statistical Manual for Mental Disorders (DSM-5, 2013), social cognition has been added as one of the six main components of neurocognitive function, alongside memory and executive control. In addition, the DSM-5 points out that impairments in social cognition often arise as deficient ToM, reduced affective empathy, impaired social perception, or abnormal social behavior.

In this chapter, we discuss social cognition processes in typical and atypical development and their relevance for the understanding of socio-affective disorders in childhood and adolescence. We therefore expect that it will be useful for clinicians, teachers, and perhaps also parents.

#### **Social Interactions and Theory of Mind in Childhood and Adolescence**

Social interactions depend essentially on the exchange of social and affective cues, which can be verbal and non-verbal (Frith & Frith, 2007). Verbal cues include vocalizations, tone of voice, and speech content, while the most important nonverbal cues that humans use are facial expression, body posture, and eye monitoring. Most of these cues are processed automatically and unconsciously (Frith & Frith, 2007). In this way, regardless of an unambiguous interpretation, we automatically decipher people's emotions through their facial expressions, while also looking for something interesting in the environment from clues such as the direction of others' gaze. Moreover, curiously, we tend to imitate the behavior of people with whom we have a good relationship (Frith & Frith, 2007). In an eloquent way, social cues guide our behavior through an ambiguous and complex world.

In childhood (especially in the first two years of life), human beings have limited attention and limited working memory. Given these restrictions, one may ask whether babies are able to process complex social information. Sensitivity to social cues seems to develop early, considering that even in early childhood such cues already guide the orientation of attention in human babies (Michel et al., 2017). In Wu and Kirkham's (2010) study, 8-month-old babies were presented with two identical audiovisual events simultaneously in two different locations on a computer screen. The babies' attention was directed by a stimulus that contained either a social cue (a face saying "Hi, baby, look at this!" that then turned toward a target event) or a non-social cue (a red square surrounding the target event). Both stimuli directed attention equally, as measured by the time the babies spent looking at the events. However, only babies exposed to social cues predicted the location of the signaled events, suggesting that social attention cues shape the likelihood and content of learning about events during childhood.

Reid and Striano (2007) propose that for a baby to react successfully to a social situation, four stages of cognitive processing must occur: (1) the detection of socially relevant organisms; (2) the identification of socially relevant organisms; (3) the evaluation of the locus of attention and direction of the observed individual's gaze in relation to the child; and (4) the detection of any attention directed to objects, or involvement with objects, by the observed individual. If the previous stages are successful, a fifth stage is reached, in which the baby is able to infer the observed individual's goal and/or prepare an appropriate response (e.g., establishing contact).

Reid and Striano (2007) suggest that the detection of biological movement, an early cognitive ability in babies, plays a key role in the detection of conspecifics and, therefore, in identifying the specifics of social interaction in stage 1. Stage 2 is possible because babies can identify idiosyncrasies in the observed organisms, for example, discriminating between a familiar and an unfamiliar individual. Stage 3 is supported by the fact that human babies are sensitive to the elements of the human face and are especially able to follow the gaze of the observed subject. In addition, they are able to distinguish whether an adult interacts in a salient manner, providing contingent feedback on social interaction, such as smiles and vocalizations, or interacts in an irregular manner, with a delay in social return (Striano et al., 2005).

Finally, babies are competent in discerning a relationship between a person and an object (Stage 4). In another study, Reid and Striano (2005) found that the direction of a person's gaze affects the coding of new objects in 4-month-old babies. In a first experimental condition, the babies saw the face of a woman with new toys being presented on her right and left sides. The woman then directed her gaze to one side, thus singling out one object and, as a consequence, averting her gaze from the other toy. In a second experimental condition, the 4-month-old babies were presented with the two toys again and looked more at the toy that had not been looked at in the first condition, probably because of the little attention paid to it, and only after that at the toy looked at in the first condition. This suggests that 4-month-old babies not only follow the adult's gaze in the first condition but also acquire more information about the object that was the focus of adult attention. The evidence that human babies respond early to social cues is quite convincing. However, a view held in social and affective neuroscience is that, in order to adequately manage the complex levels of social interaction that characterize our social life, human beings need to develop specific social-cognitive mechanisms, such as theory of mind.

Since Premack and Woodruff's (1978) seminal article "Does the chimpanzee have a theory of mind?" raised the question of chimpanzees' ability to attribute mental states to themselves and others, the subject of ToM has become part of human development studies. One of the first studies of ToM development in children was the work of Wimmer and Perner (1983), who pointed to the understanding of false belief as an indicator of preschoolers' ToM. One way to test this understanding is to place the child in a task where he or she has to predict the behavior of a character who has a belief that does not correspond to reality. An example would be to present the child with a common box of chocolates and then ask him/her to open the box to check its contents. After the child discovers, to his/her surprise, that the box contains colored pencils rather than chocolates, some questions are asked, such as: "If you show a friend the same box of chocolates and ask him/her what it contains, what do you think he/she would say? Why would he/she say that?"

It is expected that children between the ages of 3 and 5 will be able to respond correctly to this type of task, thus revealing an understanding that beliefs, being personal representations of reality, may be true or false. However, the ability to understand false beliefs seems to be the result of an ongoing process of developing skills to attribute mental and affective states to others. Accordingly, some studies analyzing children's discourse have found that references to desires precede references to cognition (Bartsch & Wellman, 1995; Peterson & Slaughter, 2006).

According to Wellman et al. (2011), the theory of mind develops progressively and sequentially from the child's ability to understand different levels of representation of mental states: (1) various desires (people may have different desires for the same thing), (2) various beliefs (people may have different beliefs about the same situation), (3) access to knowledge (something may be true, but someone may not know it), (4) false belief (something may be true, but someone may believe in something different), and (5) hidden emotion (someone may feel one way, but show different emotions).

The ToM reasoning required by different social situations may involve the assimilation of complex levels of intentionality. For example, Ygor believes (ToM of first order of intentionality) that Larissa thinks (ToM of second order) that her aunt Marcia wants (ToM of third order) that Ygor supposes (ToM of fourth order) that Larissa wants (ToM of fifth order) that her aunt Marcia believes (ToM of sixth order). This ability of first- and third-order ToM seems to improve with age (Dumontheil et al., 2010).

Dumontheil et al. (2010) showed that the ability to adopt the point of view of another agent grows from childhood, continues through adolescence, and improves even more in adulthood. Meinhardt-Injac et al. (2020) tested a two-component model of social cognition, comprising a social perception component and a socio-cognitive component, in a sample of 267 participants between 11 and 25 years of age. In addition, they measured language, reasoning, and inhibitory control as covariates. In the study, adolescents showed a substantial improvement in ToM (social perception and false belief) and in the covariate measures. An interesting finding is that the social perception component increased with age, while the socio-cognitive component (false belief) increased with both age and the covariate measures.

Performance on false belief tasks can develop further between adolescence and adult life. In the study by Valle et al. (2015), adolescents performed significantly worse than young adults in false belief tasks involving third-order ToM but performed equally well on second-order ToM. Other components of ToM also develop in adolescence, such as the social knowledge required in tests involving soft lies, forgeries, and strange stories (Maylor et al., 2002; Bosco et al., 2014). Happé (1994) developed a test called The Strange Stories to evaluate advanced mental capacity, which is suitable for adolescents and adults with superior functioning. In the test, participants read short vignettes and are asked to explain why a character said something that is literally not true. Successful performance therefore requires the attribution of mental states, such as desires, beliefs, or intentions, and sometimes higher-order mental states, such as one character's beliefs about what another character knows.

Using a subset of Happé's (1994) test stories, Maylor et al. (2002) investigated individuals aged 16–29 performing advanced ToM tasks in the first person and noted that participants scored an average of 4 out of an available maximum of 7, without any ceiling effect being reached in this age group. In the study of Bosco et al. (2014), adolescents' performance improved with age in all ToM scales, which investigate first-person and third-person ToM, first- and second-order allocating ToM, and egocentric third-person ToM. However, age differences were consistent between 11 and 13 years and then tended to stabilize between 13 and 15 years. The findings of these different studies suggest that some, but not all, components of ToM continue to develop into adulthood.

#### **Development of Emotional Recognition and Understanding in Childhood and Adolescence**

Children's emotional knowledge comprises two distinct dimensions: recognition of emotion and knowledge of the emotional situation. Recognizing emotions means that the child can label facial expressions using expressive knowledge of emotions as well as identifying emotions when expressed with verbal labels.

Nine-month-old babies are already able to discriminate between positive and negative emotions (Otte et al., 2015). In the study, 84 babies were presented with emotional vocalizations (fearful or happy) preceded by a facial expression of the same emotion or of a different emotion (e.g., a fearful vocalization with a happy expression). Event-related potentials (ERPs) recorded during the processing of this emotional information were distinct for positive and negative emotions and indicated that the babies dedicated more processing capacity to potentially threatening stimuli than to non-threatening ones. Between 18 and 24 months, children are already able to acquire the necessary terms to label basic emotions, both positive and negative (Widen & Russell, 2003, 2010).

According to Pons et al. (2004), children develop emotional understanding across three levels (external, mental, and reflective) and use at least nine components for this. The first level (already found in children of 5 years) is characterized by the understanding of public aspects of emotions, such as situational causality, external expression, and cues that reactivate an emotion. The second level (developed at age 7) is characterized by an understanding of the mental aspects of emotions: their connection with desires and beliefs, and the distinction between expressed and felt emotion. The third level (between the ages of 9 and 11) is characterized by the understanding that we can feel different feelings and that they can be contradictory and even morally charged.

Regarding the components, between approximately 3 and 4 years of age, children begin to recognize and name emotions based on expressive cues (recognition component). Thus, most children in this age group can recognize basic emotions (happiness, sadness, fear, and anger) when presented in images. Also at this age, they begin to understand how external causes affect other children's emotions (external causality component). For example, they can anticipate the sadness someone feels when losing a favorite toy. Around the age of 3 to 5, they begin to understand that people's emotional reactions depend on their desires (desire component). They are able to understand that two people can feel different emotions about the same situation because they have different desires.

Still, according to Pons et al. (2004), children between 4 and 6 years old begin to understand that a person's beliefs, whether false or true, will determine their emotional reaction to a situation (belief component). Also around this age, between 3 and 6 years, children start to understand the relationship between memory and emotion (memory component). For example, children can recognize that the intensity of an emotion decreases with time and that some aspects of the current situation can reactivate past emotions. In the same age group, between 4 and 6 years, children already understand that the emotion expressed can be different from the emotion felt (hiding component). Next, between the ages of 6 and 7, they are able to use different strategies to regulate emotions (regulation component): the younger ones use behavioral strategies, and those over 8 use psychological strategies such as denial and distraction. It is also from the age of 8 that they begin to understand that a person can have various and contradictory emotions about a given situation (mixed component). Finally, 8-year-old children are able to understand the negative feelings that result from a morally reprehensible action such as lying, stealing, or hiding, as well as the positive feelings that result from a morally dignified action such as making a sacrifice, resisting a temptation, or confessing a mistake (morality component).

Emotion recognition improves during adolescence. Functional brain imaging studies during facial emotion recognition tasks have demonstrated an increase in activation and connectivity of frontal and temporal regions from childhood to adolescence (Cohen Kadosh et al., 2011, 2013). This improvement in emotion recognition tends to continue into adulthood. The study by Tousignant et al. (2017) showed that adolescents perform worse than young adults in this area of social cognition.

#### **Social and Affective Dysfunctions**

To allow a better understanding of how social cognition impacts psychosocial functioning in children and adolescents, we will focus on the findings of research that has investigated this domain of cognition in some psychopathological conditions with the presence of social and affective dysfunctions, such as schizophrenia, ASD, ADHD, and impulse control disorders.

In the last two decades, the domains of social cognition have been intensively studied in individuals with neurological conditions, with genetic syndromes, and in population groups at risk for developing a first episode of psychosis, such as children of parents with schizophrenia. There is also broad evidence of impairments in these domains in neurological conditions such as epilepsy (Besag & Vasey, 2019). Social cognitive deficits are part of the cognitive phenotype of genetic diseases, including deletion syndromes, such as 22q11.2 deletion syndrome, and those related to numerical alterations of the X chromosome, such as Turner and Klinefelter syndromes (Morel et al., 2018). In a review study, Agnew-Blais and Seidman (2013) found that young people belonging to families at high biological risk for developing schizophrenia, especially siblings and children of individuals with schizophrenia, show deficits in aspects of social cognition, such as ToM and emotion recognition. However, it is not yet known whether deficits in the domain of social cognition follow a pattern of delays over time or are static. A population-based, prospective cohort study evaluated 7-year-old children at high familial risk of developing schizophrenia or bipolar disorder (Christiani et al., 2019). The authors found significant impairments in social responsiveness in children at risk for schizophrenia compared to controls, but not in children at risk for bipolar disorder. In several population-based cohort studies, children and adolescents who later developed schizophrenia showed premorbid social impairments (Tarbox & Pogue-Geile, 2008; Agnew-Blais & Seidman, 2013). These findings reinforce the view of schizophrenia as a neurodevelopmental disorder and the importance of identifying, in clinical practice, differences in the domain of social cognition in children and adolescents belonging to groups at risk for developing schizophrenia.

In ADHD, in addition to the classical assessment of executive functions, deficits in the domain of social cognition are increasingly investigated (Mohammadzadeh et al., 2016). A meta-analysis of studies investigating social cognition in ADHD reported that facial and vocal recognition skills and ToM were significantly impaired in ADHD (Bora & Pantelis, 2016). In addition, the authors rated the performance of individuals with ADHD as intermediate between that of individuals with ASD and that of healthy controls. An interesting finding is that deficits in social cognition appear to occur later in ADHD than in ASD and appear to depend on social interactions with family members and peers at school (Bora & Pantelis, 2016). Not surprisingly, social dysfunction is one of the most impactful aspects for the psychosocial development of children and adolescents with ADHD, as individuals with ADHD often report significant interpersonal problems, including conflict with parents, siblings, peers, and teachers (Ros & Graziano, 2018). Social dysfunction in ADHD appears to depend on a number of factors, including social skills, the ability to process information and modulate social responses (social cognition), and contexts of social interaction with peers (Ros & Graziano, 2018). Although these reviews seemingly converge on a view of deficits in social cognition in children and adolescents with ADHD, most clinicians seem to overlook that ADHD patients tend to experience not only limitations in social skills but also impairments in social information processing. Furthermore, ToM skills are associated with different patterns of prosocial behavior such as helping, cooperating, and comforting (Imuta et al., 2016), and since ToM may be significantly impaired in ADHD, these patients would be less likely to develop positive social interactions with their peers, teachers, and family members.

In impulse control disorders, social cognition is an aspect of patient functioning that has gained relevance. Impulse control disorders (ICDs) form a heterogeneous group of mental disorders related to the failure to resist impulses to perform dangerous, troublesome, or disturbing behaviors. This category includes pathological gambling (PG), kleptomania, pyromania, trichotillomania, internet gaming disorder (IGD), and intermittent explosive disorder (IED), among others. Aspects of the social cognition domain have been investigated in at least the last two of these clinical conditions (Coccaro et al., 2016). In IED, the biopsychosocial model of impulsive aggression holds that the individual with the disorder usually explodes in response to social threat, and one of the main dysfunctions would be in social information processing (Coccaro et al., 2011). Thus, the individual with IED, when encoding and interpreting social cues, often attributes hostile intent in social interactions (Coccaro et al., 2011). A study of attribution style, one aspect of social cognition, demonstrated that participants with IED show higher hostile attribution in ambiguous social situations than healthy controls and patients with other psychiatric disorders, and also that attribution style was directly related to negative emotional response (Coccaro et al., 2016).

In the case of IGD, patients are observed to have significant social impairments, with a predilection for social interactions in the online environment rather than in real life (Caplan, 2010). In the cognitive model of IGD proposed by Caplan (2010), pathological internet use could be defined by two cognitive features: preference for online social interaction and worry. Preference for online social interaction can be defined as an individual's tendency to develop beliefs that online interactions and relationships are better, safer, and more comfortable than face-to-face ones (Caplan, 2010). Worry, or cognitive salience, is defined as obsessive thought patterns about internet use (Caplan, 2010). Some studies have investigated whether a component of social cognition, emotion recognition, is related to problematic internet use in adolescents (Spada & Marino, 2017; Yavuz et al., 2019). Specifically, in relation to IGD, it is suggested that adolescents engage in gaming as a strategy to alleviate affective dysfunction caused by poor skill in recognizing negative emotions (Yavuz et al., 2019). Another study revealed that low negative emotion recognition skills predicted cognitive salience, tolerance, and relapse associated with IGD in adolescents (Aydın et al., 2020). Taken together, these findings suggest that negative emotion recognition, attribution style, and consequently the cognitive-affective regulation strategies used by adolescents aroused by negative affect may increase their risk of developing impulse control disorders, specifically IED and IGD.

#### **Treatment**

Behavioral interventions focusing on the stimulation of socio-emotional skills have been developed and their effectiveness tested from a remedial or a preventive perspective. For the follow-up of children and adolescents with a history of exposure to maltreatment, there are several approaches, such as cognitive-behavioral therapy (CBT), Eye Movement Desensitization and Reprocessing (EMDR), therapies based on artistic activities or contact with animals, and family-focused interventions (e.g., systemic family therapy) (Macdonald et al., 2016).

In clinical conditions, such as autism, the most well-known interventions are those based on behavior analysis models. For children with high-functioning autism, virtual reality-based training has proven to be a complementary strategy that can stimulate engagement in the therapeutic process (Didehbani et al., 2016). Simmons et al. (2019) proposed an integrative model for intervention programs that considers social cognition and executive functions simultaneously. In fact, impairments in these cognitive domains have been associated with the symptomatology of both ASD and ADHD (Van der Meer et al., 2012). According to the model, for a target social and affective problem, intervention strategies should consider social cognitive components (e.g., practice in recognizing emotional cues) or executive components (e.g., behavioral rehearsal).

Other programs have a more preventive character and are generally organized for implementation in schools. Emphasis on programs of this nature has been strengthened by evidence of the importance of developing socio-emotional skills, such as understanding one's point of view or self-regulation, for academic performance (Blair, 2002; Fantuzzo et al., 2007). An example is Social and Emotional Learning, or SEL (Durlak et al., 2011). The program aims to stimulate the management of emotions and the development of empathy in order to strengthen positive relationships and more responsible decision-making. In a meta-analysis, Durlak et al. (2011) reviewed 82 school-based SEL interventions implemented inside and outside the United States, covering 97,406 kindergarten to high-school students. The programs were selected according to recommended practice criteria including a clear alignment of goals and curriculum, involvement of students in all steps of implementation, attention to community needs, and reflective activities such as class discussions. Results provided evidence of positive effects on students' attitudes toward themselves and toward school and learning, on social behavior, and on academic performance, independently of students' race or socioeconomic background.

#### **Conclusion**

In this chapter, we describe how social cognitive processes, such as emotion recognition and ToM, develop. These processes are of paramount importance for the understanding of normal and pathological processes in childhood and adolescence. The continuous processing of social information and the perception of emotional states combine to build up a child's emotional knowledge and to develop social competence.

The process by which we understand our own emotions and those of others, and assign intentions and desires to others, helps to explain the different challenges we face throughout human development. The abilities to recognize emotions and assign mental states, important domains of social cognition, are crucial to assessing someone's immediate social environment, providing valuable information about the inner emotional state of others, and influencing adaptive social behavior and social interactions. The observation of social and affective dysfunction in children and adolescents raises the question of whether these deficits are risk factors for neuropsychiatric disorders, the untoward consequences of neuropsychiatric disorders, or both. Socio-affective neuroscience offers ways to elucidate this issue. In addition, in the health sciences, there has been a focus on the early identification of developmental conditions in which fast and rational intervention can lead to positive outcomes. In most neuropsychiatric disorders, difficulties in social functioning are evident, and the earlier we can identify and remediate social and affective dysfunction, the better.

We addressed how these aspects relate to healthy development and to neuropsychiatric disorders. Since the skills involved in social cognition are extremely important for future interactions, diligent investigations of this phenomenon are useful for clinical and applied neuroscience. Understanding the social and affective factors involved in children's development is fundamental to understanding the adults who will emerge from it.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 11 Clinical Neuroscience Meets Second-Person Neuropsychiatry**

**Leonhard Schilbach and Juha M. Lahnakoski**

**Abstract** Disturbances of social and affective processes are at the core of psychiatric disorders. Together with genetic predisposing factors, deprivation of social contact and dysfunctional relationships during development are some of the most important contributors to psychiatric disorders over the lifetime, while some developmental disorders manifest as aberrant social behavior early in life. That the cause of mental illness is rooted in the brain was long held as a truism, yet finding the causes for and neurobiological correlates of these conditions in the brain has proven and continues to be difficult (Venkatasubramanian G, Keshavan MS, Ann Neurosci 23:3–5. https://doi.org/10.1159/000443549, 2016). In clinical practice, psychiatric disorders are diagnosed based on categorical manuals, such as the DSM and ICD, which form a useful guide for clinical diagnosis and interventions. Yet, understanding the specific neural mechanisms leading to or characterizing distinct psychiatric conditions through this categorical approach has been slow (see, for example, Lynch CJ, Gunning FM, Liston C, Biol Psychiatry 88:83–94. https://doi.org/10.1016/j.biopsych.2020.01.012, 2020). Findings in the brain often do not seem to lend support to common mechanisms for the defined disorder categories. This is not particularly surprising because, in these diagnostic manuals, multiple combinations of symptoms can often lead to the same diagnosis, which is reflected in highly variable phenotypes of psychiatric disorders.

**Keywords** Psychiatric disorders · Second-person neuroscience · Neuropsychiatry

J. M. Lahnakoski

Forschungszentrum Jülich, Institute of Neurosciences and Medicine (INM), Jülich, Germany

L. Schilbach (\*)

Independent Max Planck Research Group for Social Neuroscience, Max Planck Institute of Psychiatry, Munich-Schwabing, Germany

LVR Klinikum Düsseldorf/Kliniken der Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany

Ludwig-Maximilians-Universität, Medical Faculty, Munich, Germany e-mail: leonhard.schilbach@lvr.de

#### **Introduction**

Disturbances of social and affective processes are at the core of psychiatric disorders. Together with genetic predisposing factors, deprivation of social contact and dysfunctional relationships during development are some of the most important contributors to psychiatric disorders over the lifetime, while some developmental disorders manifest as aberrant social behavior early in life. That the cause of mental illness is rooted in the brain was long held as a truism, yet finding the causes for and neurobiological correlates of these conditions in the brain has proven and continues to be difficult (Venkatasubramanian & Keshavan, 2016).

In clinical practice, psychiatric disorders are diagnosed based on categorical manuals, such as the DSM and ICD, which form a useful guide for clinical diagnosis and interventions. Yet, understanding the specific neural mechanisms leading to or characterizing distinct psychiatric conditions through this categorical approach has been slow (see, for example, Lynch et al., 2020). Findings in the brain often do not seem to lend support to common mechanisms for the defined disorder categories. This is not particularly surprising because, in these diagnostic manuals, multiple combinations of symptoms can often lead to the same diagnosis, which is reflected in highly variable phenotypes of psychiatric disorders. This variability, coupled with the complexity of the brain and its capacity to compensate for regional disturbances through plastic changes, makes it harder still to find causes and neural mechanisms of heterogeneous disorder labels. Moreover, evaluating the low-level contributors to psychiatric disorders is complicated, as animal models for psychiatric conditions often cannot capture the complexity of these disorders in humans.

Recently, calls have been made for transdiagnostic approaches, such as the Research Domain Criteria (RDoC) framework (Insel et al., 2010), where mental illness is approached through specific behavioral domains rather than lists of specific symptoms and diagnostic labels. However, the majority of clinical research still relies on the categorization of patients, and even studies explicitly applying the RDoC framework to analyze neuroimaging data have mainly focused on finding correlates for a limited subset of the domains (Carcone & Ruocco, 2017). To address the variability of disorder phenotypes and to delineate particularly relevant transdiagnostic domains, a focus on social and affective processes relevant for mental health and illness appears to be crucial for improving our understanding of the brain basis of psychiatric disorders in a way that maximally benefits the patients. This is motivated by the insight that social impairments are some of the most debilitating facets of psychiatric disorders and the conceptual consideration that ascriptions of psychopathology always make reference to intersubjective conventions, which has led to the construal of psychiatric disorders as "disorders of social interaction" (Schilbach, 2016).

In this chapter, we will outline major approaches to studying the brain basis of psychiatric disorders, mainly focusing on our own work and related functional magnetic resonance imaging (fMRI) studies while briefly considering evidence from structural and brain stimulation studies as well. Furthermore, we will discuss recent methodological developments inspired by the above-described focus on social interaction, which point to a possible convergence of clinical neuroscience and psychiatry that could be described as the development of a second-person neuropsychiatry. This development highlights the importance of quantitatively measuring behavioral characteristics of patients during real-life social interaction and moving toward studying active behavior in addition to passive processing of social and affective information (Lahnakoski et al., 2020; Schilbach, 2019). Finally, we will consider when and why it is important to also measure the brain function of two (or more) people during real-time interaction and how the quantification of behavior becomes even more important under such complex conditions.

#### **Structural Abnormalities and Functional Connectivity Correlates of Psychiatric Disorders**

If we follow the long-standing logic that psychiatric disorders are "disorders of the brain" (Insel & Cuthbert, 2015), it is reasonable to assume that these disorders are reflected as physical and functional abnormalities in the brains of the affected individuals. In some cases, such direct links exist, as witnessed by specific deficiencies or behavioral alterations due to brain injury and lesions in specific brain areas. Brain lesions can also lead to psychiatric symptoms in some cases; for example, mood disorders are often reported after traumatic brain injury. Yet, findings on specific, focal abnormalities appear to be inconclusive for most psychiatric conditions. One notable exception is a focal target in the subgenual cingulate cortex that, when stimulated intracranially, can lead to a reduction of symptoms at least in some patients suffering from treatment-resistant depression (Mayberg et al., 2005). This effect is likely mediated not only by changes in local activity but also by the way this region modulates activity in other brain regions through its connections.

Beyond focal differences, structural abnormalities of white matter bundles or of functional hubs of the brain, i.e., brain regions with a high number of connections, can disturb the functional architecture of the brain. This is clearly visible in some neurological conditions, such as multiple sclerosis, which affects the myelin sheath of neurons, thereby disturbing the electrical conduction of signals between brain regions. Functional connectivity, usually measured through temporal correlations of hemodynamic activity with fMRI, is thought to reflect the organization of brain connections and their functional integration. Repeatable patterns of connectivity reflecting plausible functional networks have been produced in a multitude of studies. Often, this connectivity is studied in the absence of a task, with the (implicit or explicit) assumption that the connectivity reflects relatively stable properties of the underlying anatomy and physiology. Indeed, the effects of task-induced activity on the functional connectivity patterns are reasonably subtle compared with the large-scale network structure (Gratton et al., 2018; Simony et al., 2016). There is some evidence of reliable group-level differences in psychiatric disorders, such as autism spectrum disorder (Holiga et al., 2019), across multiple studied populations. Yet, the variability between individuals in local connectivity measures tends to be high both in patients and in control populations, highlighting the difficulty of finding common neural underpinnings for these disorders. Moreover, the temporal fluctuations of connectivity have recently gained more interest, leading to an ongoing debate on whether state transitions and meta-states of connectivity at shorter timescales are a reliable or a more sensitive predictor of psychopathology than time-averaged connectivity or, alternatively, an artifactual property of analysis methods applied to slow and noisy signals.
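To make the two notions of connectivity discussed above concrete, the following minimal sketch computes time-averaged (static) connectivity as the Pearson correlation between regional time series and a simple sliding-window variant as one possible way to look at temporal fluctuations. The signals, the function names, and the window width are illustrative assumptions and do not come from any of the cited studies; real analyses additionally involve preprocessing, confound regression, and statistical control.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def connectivity_matrix(timeseries):
    """Static connectivity: correlate every pair of regional time series."""
    n = len(timeseries)
    return [[pearson(timeseries[i], timeseries[j]) for j in range(n)]
            for i in range(n)]

def sliding_window_connectivity(x, y, width):
    """Time-resolved connectivity: one correlation per window position."""
    return [pearson(x[s:s + width], y[s:s + width])
            for s in range(len(x) - width + 1)]

# Hypothetical toy signals for three regions (not real fMRI data).
region_a = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
region_b = [0.1, 0.9, 2.1, 1.2, -0.1, -1.1, -1.8, -0.9]  # closely follows region_a
region_c = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # fast, unrelated oscillation

fc = connectivity_matrix([region_a, region_b, region_c])
windows = sliding_window_connectivity(region_a, region_c, width=4)

print("static a-b:", round(fc[0][1], 3))
print("static a-c:", round(fc[0][2], 3))
print("windowed a-c:", [round(w, 2) for w in windows])
```

In this toy case the time-averaged correlation between regions a and c is zero, while the windowed values swing between positive and negative. With short windows on noisy signals such swings can arise by chance, which is one reason the debate mentioned above remains open.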

Recently, other approaches looking at more global network or subnetwork properties, rather than local differences, have been gaining more attention. For example, differences in the subnetwork structure of functional networks including limbic regions have been reported that seem reliable across samples both during resting state and movie viewing paradigms (Glerean et al., 2016). However, the implications of these findings at the level of an individual remain unclear. Most network analyses rely critically on thresholding of the connectivity matrices (Garrison et al., 2015), and network properties can change considerably with a small change in the selected threshold. This can be alleviated, for example, by using relative thresholds (Garrison et al., 2015) or by considering different ranges of connectivity values separately rather than setting a single threshold (Bassett et al., 2012), which can help in detecting connectivity patterns that are predictive of psychopathology.
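The sensitivity to the threshold can be illustrated with a small sketch. It assumes that a "relative" threshold means keeping a fixed proportion of the strongest connections, which is one common reading and may differ in detail from the procedures in the cited papers; the matrix values and function names are hypothetical.

```python
def upper_triangle(matrix):
    """Return (i, j, weight) for each unique pair of regions."""
    n = len(matrix)
    return [(i, j, matrix[i][j]) for i in range(n) for j in range(i + 1, n)]

def absolute_threshold(matrix, tau):
    """Keep edges whose connectivity value is at least tau."""
    return {(i, j) for i, j, w in upper_triangle(matrix) if w >= tau}

def proportional_threshold(matrix, density):
    """Keep the strongest fraction `density` of all possible edges."""
    edges = sorted(upper_triangle(matrix), key=lambda e: e[2], reverse=True)
    keep = round(density * len(edges))
    return {(i, j) for i, j, _ in edges[:keep]}

def edge_density(edges, n_regions):
    """Fraction of all possible undirected edges that were kept."""
    return len(edges) / (n_regions * (n_regions - 1) / 2)

# Hypothetical 4-region connectivity matrix with several values near 0.5.
toy = [
    [1.00, 0.52, 0.48, 0.10],
    [0.52, 1.00, 0.49, 0.30],
    [0.48, 0.49, 1.00, 0.51],
    [0.10, 0.30, 0.51, 1.00],
]

strict = absolute_threshold(toy, 0.50)    # two edges survive
lenient = absolute_threshold(toy, 0.47)   # four edges survive
fixed = proportional_threshold(toy, 0.5)  # always half of the edges

print(sorted(strict), edge_density(strict, 4))
print(sorted(lenient), edge_density(lenient, 4))
print(sorted(fixed), edge_density(fixed, 4))
```

Lowering the absolute threshold by 0.03 doubles the number of edges in this example, so any graph measure computed afterwards would be compared across networks of different density. A proportional threshold fixes the density instead, at the cost of possibly admitting weak connections in sparsely connected individuals.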

One recent development has combined lesion studies with connectivity measures, where functional connectivity in patient groups sharing similar symptoms yet having distinct focal brain lesions suggests that the connectivity of the lesioned areas may be particularly important in determining the functional consequences for the patients. For example, two aspects of "free will," volition and agency, appear to be differentially affected depending on the connectivity of the lesion site (Darby et al., 2018), with the former being associated with lesions in regions that connect to the anterior cingulate cortex and the latter with regions connecting to the precuneus. These results suggest that aberrant structure or function of different sets of brain regions may potentially have common effects through their connections in a region that is not directly affected by the lesion. However, whether these findings prove helpful for patients suffering from psychiatric disorders remains unclear.

Importantly, it seems that differences in functional brain networks, compared to a healthy population, are highly overlapping between multiple psychiatric disorders. Indeed, rather than being disorder-specific, measures of the general level of psychopathology, the so-called p factor (Caspi et al., 2014), can often explain much of the neuroimaging findings. It has been argued that there may be a common underlying contributor that predisposes individuals to developing a range of psychiatric disorders, which may also be reflected in the overlap of genetic findings across psychiatric disorders. This view is supported by recent findings of shared neurobiological and cellular mechanisms of at least six different psychiatric disorders reported by the relevant working groups of the ENIGMA project (Patel et al., 2020). Controlling for these disorder-general correlates of psychiatric disorders may help in pinpointing the disorder-specific mechanisms. However, if the brain is studied through static anatomical and connectivity properties without any behavioral readouts beyond a categorical label, understanding the significance of these findings for the social life and general well-being of patients is not straightforward.

#### **Stimulation-, Task-, and Model-Based Studies**

Most of social cognitive neuroscience, particularly neuroimaging studies, has focused on simplified stimulus- and task-based designs. The goal here is to isolate and systematically manipulate particular constituent features or task components that together could enable more complex tasks to be performed. In a clinical context, one might then compare how strongly particular brain regions are activated by a given task across different diagnostic groups or whether the activity level is correlated with certain symptom dimensions.

This approach has clear benefits for the interpretation of potential group differences because the observed brain activity can be linked to specific cognitive functions, in particular when mathematical modeling allows changes in brain activity to be predicted, which can be taken to suggest that the brain realizes similar computations to generate and control behavior. One example of this approach is a suite of recent studies by Henco and colleagues (2020a, b), in which they investigated the effect of implicit social cues (e.g., gaze shifts of a face) in biasing decision-making in a probabilistic learning task, even though study participants were not asked to take the social cues into account. Intriguingly, the way these social cues affect decision-making appears to be mechanistically different in individuals with borderline personality disorder and schizophrenia compared to both healthy controls and patients with major depressive disorder (Henco et al., in press), suggesting that the study of implicit social processes in combination with computational modeling might be particularly helpful in elucidating the neural mechanisms that differentiate these disorders.

Importantly, these kinds of experimental tasks use a fixed reward and learning schedule, which offers high levels of experimental control, and lend themselves to data analytic approaches that use mathematical models to describe the cognitive and putatively neural mechanisms that underlie participants' behavior. Using hierarchical Gaussian filter models, Sevgi and colleagues (Sevgi et al., 2020) demonstrated how participants integrate social and nonsocial information to come up with their decisions and how this differs as a function of interindividual variance in autistic traits. The parameters derived from computational modeling can also be used to inform neuroimaging analysis, which has become known as model-based fMRI: here, it can, for instance, be assessed whether trial-to-trial changes of modeling parameters are related to brain activity changes. Using this approach, Henco et al. (2020a, b) demonstrated that interindividual differences in social belief computations, i.e., whether participants tend to use social cues during decision-making, even when not explicitly instructed to do so, were related to brain activity levels in the putamen and insula, areas that have previously been associated with habitual behaviors and interoception.
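The logic of model-based analysis can be sketched with a much simpler learner than the hierarchical Gaussian filter used in the studies above. The example below swaps in a basic delta-rule (Rescorla-Wagner-style) update, purely as an illustration of how a model turns a sequence of outcomes into trial-wise quantities that could serve as a parametric regressor. The learning rate, the starting belief, the outcome sequence, and all names are assumptions made for this sketch.

```python
def delta_rule(outcomes, learning_rate=0.5, initial_belief=0.5):
    """Track the belief that a social cue is helpful, trial by trial.

    Returns the belief held before each outcome and the prediction
    error (outcome minus belief) observed on that trial.
    """
    beliefs, prediction_errors = [], []
    belief = initial_belief
    for outcome in outcomes:
        error = outcome - belief
        beliefs.append(belief)
        prediction_errors.append(error)
        belief += learning_rate * error  # move belief toward the outcome
    return beliefs, prediction_errors

def mean_center(values):
    """Center a trial-wise series, as is common for parametric modulators."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical schedule: 1 = the gaze cue pointed to the rewarded option.
cue_was_correct = [1, 1, 0, 1, 1, 1, 0, 1]

beliefs, errors = delta_rule(cue_was_correct)
modulator = mean_center(errors)

for trial, (b, e) in enumerate(zip(beliefs, errors), start=1):
    print(f"trial {trial}: belief={b:.3f} prediction_error={e:+.3f}")
```

In a model-based fMRI analysis, a series such as `modulator` would typically be aligned to trial onsets and convolved with a hemodynamic response function before entering a regression on the measured signal; those steps are omitted here. Fitting the free parameters to each participant's choices, rather than fixing them as above, is what allows interindividual differences to be compared.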

#### **Naturalistic Passive Observation**

While the conventional approaches described above have allowed us to gain completely new insights into relevant brain processes, many of them tend to rely on the assumption of "pure insertion," at least approximately (Friston et al., 1996). That is, it is assumed that effects of the manipulation of individual features or processes are essentially independent of each other and, in more complex or naturalistic conditions, these effects sum up to produce more complex processes or behaviors. In some cases, findings of simplified experiments generalize to more natural conditions, at least to some extent. For example, contrast edges of video images correlate with activity in the early visual cortex (Lahnakoski et al., 2012a), as might be expected based on the properties of edge-detecting cells in the region, but the amount of variance explained is relatively low. Careful consideration of a range of stimulus features can reveal insight into the organization of the brain networks of naturalistic social observation, for example, highlighting regions such as the posterior superior temporal sulcus and surrounding temporoparietal regions as potentially key regions for integrating multiple types of socially relevant information (Lahnakoski et al., 2012b) as well as building coherent temporal sequences of related events (Lahnakoski et al., 2017). The amplitudes of responses to emotionally arousing events in these regions also appear to be related to individual differences in the endogenous opioid system (Karjalainen et al., 2019), which may prove helpful for assessing potentially aberrant neurotransmitter function in psychopathology. Importantly, however, it is less clear how complex intuitive social processes can be deconstructed into more basic constituents.
Arguably, more naturalistic social processes are only observable in complex situations, and the underlying processes may not be directly accessible through the stimulus properties, event descriptions, or even simple dimensional models of emotion alone. For example, recent findings have shown that when participants share a point of view toward movie events, either experimentally (Lahnakoski et al., 2014) or through friendship in everyday life (Parkinson et al., 2018), the similarity of the brain activity is increased compared with individuals who do not share a perspective or do not know each other. Such similarity between friends appears not to be reflected in functional connectivity during rest (McNabb et al., 2020), although more sensitive measures may yet reveal such associations. Moreover, naturalistic stimulation may provide benefits for detecting aberrant brain activity related to psychiatric disorders (Eickhoff et al., 2020), and prediction of behavioral traits may prove to be more successful using connectivity measures derived from natural viewing paradigms, particularly social ones, rather than from resting state data (Finn & Bandettini, 2020). Thus, more ecologically valid dynamic stimulation may not only highlight brain-behavior associations but also highlight the types of content that best reveal these associations to guide us to further our understanding of naturalistic brain processes beyond simple models of stimulus features or general emotion dimensions (see, for example, Finn et al., 2020).
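The similarity of brain activity across viewers is commonly quantified by correlating regional time courses between individuals. The sketch below illustrates the idea with a mean pairwise Pearson correlation; it is not the analysis pipeline of the cited studies, and the function names and the toy time courses (one short series per hypothetical participant) are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_pairwise_similarity(time_courses):
    """Average correlation over all pairs of participants."""
    pairs = list(combinations(time_courses, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Toy regional time courses (hypothetical values, one list per participant).
shared_view = [
    [0.1, 0.9, 0.4, 1.2, 0.3],
    [0.0, 1.0, 0.5, 1.1, 0.2],
    [0.2, 0.8, 0.5, 1.3, 0.4],
]
different_view = [
    [0.1, 0.9, 0.4, 1.2, 0.3],
    [1.0, 0.2, 0.9, 0.1, 0.7],
    [0.5, 0.4, 0.6, 0.5, 0.4],
]

print("shared perspective:   ", round(mean_pairwise_similarity(shared_view), 3))
print("different perspective:", round(mean_pairwise_similarity(different_view), 3))
```

With real data, the same computation would be repeated for every voxel or region, and group differences would need to be tested with permutation-based statistics, because pairwise correlations are not independent of each other.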

However, despite this potential benefit in highlighting individual differences, the use of naturalistic stimuli in the study of psychiatric disorders is still relatively rare. Some of the earliest studies have shown that, for example, individuals with ASD tend to show idiosyncratic patterns of both eye gaze and brain activity during natural viewing conditions (Hasson et al., 2009; Salmi et al., 2013). This highlights a potential difficulty in understanding the brain mechanisms underlying psychiatric disorders mentioned earlier: if patients with the same diagnosis are highly variable, then group contrasts and predictions are likely to fail. It thus appears particularly important to further characterize the participants' behavior and experiences, as well as the contents of the stimuli that are particularly relevant for detecting the disorders. For example, during movie viewing, aberrant brain activity related to positive symptoms of first-episode psychosis patients appears to be particularly observable during surreal, fantasy scenes, which may share aspects of the patients' symptoms (Rikandi et al., 2017). Further work is required to discover the limits of passive observation studies and to what extent specific neural functions can be studied in complex conditions, with more limited experimental control. Likely, a fruitful approach is to iteratively alternate between more exploratory findings in naturalistic experiments, working backward toward more controlled conditions to design experiments to test specific hypotheses on the low-level mechanisms of psychiatric disorders, and testing the mechanistic predictions again in more naturalistic conditions, potentially in interactive tasks mimicking real-life situations where the presumed mechanism is particularly important (cf. Schilbach, 2019).

#### **Interactive Experiments, Second-Person Neuroscience, and Neuropsychiatry**

While investigating more naturalistic social situations is beneficial to understand complex social cognition, it has been pointed out that a fundamental difference may exist between situations of social observation, i.e., social cognition from an observer's point of view, as compared to situations of social interaction, i.e., social cognition from an interactor's point of view (Schilbach, 2014, 2016; Schilbach et al., 2013). Contrary to the conventional stimulus-response paradigms described above, social interactions are characterized by behavioral reciprocity. That is, social perception leads to actions that, in turn, will be responded to by the interaction partner (and so forth). In order to investigate how these social contingencies and the ensuing dynamics of social interaction modulate brain activity, we, therefore, need truly interactive tasks, which allow for the participant to engage in such reciprocal social interactions. Following the call for a truly social or second-person neuroscience, recent years have seen a growing number of studies that have focused on core social-interactive behaviors, such as studies in which participants perceive communicative cues to engage them in interaction (e.g., direct gaze) all the way to studies that include reciprocal, face-to-face interactions with a social partner (real or perceived; see Redcay and Schilbach (2019) for a recent review). In addition to increasing the ecological validity of the task used and making the social encounters more lifelike and dynamic, for example, using real video recordings in place of computer-generated avatars (Brandi et al., 2019), second-person neuroscience has also focused on scanning interacting brains, which has been described as hyperscanning (e.g., Bilek et al., 2015; Dumas et al., 2010).
Findings from these studies have helped to gain striking new insights into the workings of "social brains," which, indeed, indicate that the neural mechanisms supporting social interaction do, in fact, differ from those during social observation. Findings converge on a set of brain regions and large-scale neural networks that appear to play key roles and interact in intricate ways in order to support social behavior during social interaction. In addition, the use of two-person experiments and hyperscanning techniques allows us to take a completely new look at how social behavior is realized across persons and brains and to investigate phenomena such as interpersonal synchrony, mimicry, and other forms of alignment in more ecologically valid contexts (Bolis et al., 2017; Schilbach, 2015). These developments constitute important steps in the advancement of social neuroscience and will continue to provide new insights into how activity in large-scale neural networks is modulated by social interactions and also open up new avenues for future research.

In addition to this, a second-person neuroscience may also be relevant for neuroimaging research in the field of psychiatry and could, therefore, contribute to what might be called a second-person neuropsychiatry (Schilbach, 2016): Here, it has been increasingly recognized that it is social interaction rather than passive observation that is often most difficult for patients suffering from psychiatric disorders. For example, an individual may well understand an emotion depicted in a movie, as the conventions that have been developed by the artists working in the movie industry are highly efficient in conveying emotions, whereas in real life emotional cues may be much subtler. Moreover, in real-time interactions, there is little time for explicit interpretation of the socio-emotional states of the interaction partner; instead, one relies on a practical "know-how" of how to deal with them. In other words, people often automatically understand, empathize with, and predict the words or actions of their partner, enabling them to act appropriately without explicit reasoning. This has been demonstrated in a study by von der Lühe and colleagues (von der Lühe et al., 2016), in which it was shown that patients with high-functioning autism are able to recognize and explicitly label actions even when they are depicted by impoverished point-light displays but fail to use this information to predict the subsequent action of a potential interaction partner. In other words, it was only the complexity of a dyadic social interaction situation that brought about autism-specific deficits in predicting subsequent actions rather than difficulties in action perception, which was found to be intact.
Following this lead, it appears important to introduce new methods and techniques that help us to quantitatively assess behavior during real-life social interactions as this may help to understand how social interaction difficulties might be related to alterations of cross-brain rather than single-brain network activity (Bilek et al., 2017; Bolis & Schilbach, 2018).

#### **Behavioral Characterization of Psychiatric Disorders in Individuals, Dyads, and Social Networks in Everyday Life**

Studying constrained social interactions in the laboratory has clear benefits for interpretability compared with trying to measure interactions "in the wild," much like controlled task designs in neuroimaging studies often allow for more straightforward modeling and interpretation of results than more naturalistic experiments. Yet, constrained experiments can be rather poor approximations of real-life social behavior. Moreover, our initial systematic measures of behavior during dyadic interaction suggest that some behavioral characteristics of individuals may only manifest when they can interact freely, with minimal experimental constraints (Lahnakoski et al., 2020). Thus, enabling the systematic, quantifiable measurement of social behavior in natural interactions, i.e., interaction-based phenotyping, in the clinic as well as in the everyday life of patients may be crucial for understanding the individual as well as shared symptoms of psychiatric disorders (Schilbach, 2019). This type of extensive characterization of psychiatric disorders at the level of individual patients may be the key to disentangling general brain correlates of psychopathology from disorder- and symptom-specific brain mechanisms. Moreover, it may be the key to finally move toward individualized interventions in psychiatry, which to a large extent are still lacking.

Interestingly, behavioral measures, such as interindividual synchrony and mimicry, distance, gaze, and orienting of the face and the body, have been shown to be predictive of the subjective quality of interactions (Lahnakoski et al., 2020). Also, using measures of motion energy in videos of patients and their therapist, behavioral synchrony has shown promise in predicting short- and long-term therapeutic success for patients with schizophrenia (Ramseyer & Tschacher, 2014). It may also be possible to differentiate patients with autism spectrum disorder (ASD) from control participants based on their behavioral synchrony with an interaction partner (Georgescu et al., 2019), although further work is needed to evaluate the practical applicability of these preliminary findings in larger cohorts. Importantly, however, such simple measures of motion synchrony lack specificity regarding what the people are doing during the interaction. Moreover, synchrony does not appear to always be useful for detecting differences in subjective interaction quality. In the study mentioned above (Lahnakoski et al., 2020), we showed that measures like distance and facial orienting behavior may be more indicative of the subjective enjoyment and effort invested into interactions, respectively. Moreover, these may be differently predictive in different conditions, so a single measure may not fit all questions.
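As a rough illustration of how synchrony can be derived from motion energy, the sketch below computes motion energy as the summed absolute change between consecutive video frames and then correlates two such series at several time lags. It is a simplified assumption-laden example, not the procedure of the cited studies: frames are represented as flat lists of pixel intensities, and the function names, lag range, and toy data are hypothetical.

```python
from math import sqrt


def motion_energy(frames):
    """Summed absolute pixel change between consecutive frames."""
    return [
        sum(abs(curr_px - prev_px) for prev_px, curr_px in zip(prev, curr))
        for prev, curr in zip(frames, frames[1:])
    ]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm = sqrt(sum((a - mean_x) ** 2 for a in x)) * sqrt(
        sum((b - mean_y) ** 2 for b in y)
    )
    return cov / norm if norm > 0 else 0.0


def lagged_synchrony(series_a, series_b, max_lag):
    """Correlation at each lag; a positive lag means series_b follows series_a."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = series_a[: len(series_a) - lag], series_b[lag:]
        else:
            a, b = series_a[-lag:], series_b[: len(series_b) + lag]
        result[lag] = pearson(a, b)
    return result


# Tiny two-pixel "video" to show the motion energy computation.
print(motion_energy([[0, 0], [1, 0], [1, 3]]))

# Hypothetical motion energy series; the second person echoes the first
# with a delay of one frame.
person_a = [0, 1, 4, 2, 0, 3, 5, 1, 0, 2]
person_b = [0, 0, 1, 4, 2, 0, 3, 5, 1, 0]
synchrony = lagged_synchrony(person_a, person_b, max_lag=2)
best_lag = max(synchrony, key=synchrony.get)
print("lag with highest synchrony:", best_lag)
```

The lag structure is what makes it possible to ask who tends to lead and who tends to follow, but, as noted above, a high correlation says nothing about which behaviors produced the movement.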

While such systematic and quantitative characterizations of dyadic social interactions appear to be a fruitful avenue for evaluating, for example, dyadic behavior during interactions with a therapist in the clinic, the majority of social interaction problems manifest in everyday life. Anecdotally at least, patients may feel fine at the clinic and have severe relapses of symptoms after they are discharged and have to continue their daily lives. Thus, to get a picture of the causes of the daily difficulties patients face, beyond subjective evaluations, quantitative measurements should be extended to the daily life of individuals. The recent widespread introduction of personal digital devices, such as smartphones, led to the development of digital phenotyping (Onnela & Rauch, 2016), where such devices, potentially complemented by, for example, wearable sensors, can be used to continuously measure the behavior of individuals in their everyday life. This approach can produce a wealth of data for detecting various social and behavioral characteristics of illnesses (Torous et al., 2016), which can be of great benefit for finding behavioral markers that may guide therapy and further scientific inquiry. Yet, much work is still required to detect consistent, meaningful patterns in this type of data. Moreover, pattern detection and behavioral prediction that are not informed by a strong theoretical foundation cannot substitute for a mechanistic understanding of the disorders. On an optimistic note, the use of interaction-based phenotyping and other forms of digital phenotyping "in the wild" may help to investigate the social behavior and factors that are relevant and constitutive of psychiatric disorders. As the relevant classifications used in psychiatry today rely on intersubjective conventions of what should be considered as a nosological entity, the use of quantitative, data-driven approaches that integrate information about social, psychological, and biological factors may help to delineate disorder-general and disorder-specific profiles for what we take as separate disorders today. In addition, a major challenge for the future also lies in the definition of mechanistic models of psychiatric disorders that are grounded in the underlying neurophysiology and are able to make predictions of outcomes of specific disturbances of the system and interventions that alleviate such disturbances. So far, the existing models have yet to prove their usefulness on a larger scale. However, initial mechanistic insight into potential contributors to disordered social processing has started to shed light on underlying psychological mechanisms of psychiatric disorders. In the two studies mentioned earlier (Henco et al., 2020a, b, In press), we used hierarchical learning models not only to demonstrate that patients with schizophrenia and borderline personality disorder score lower in a probabilistic learning task in the presence of implicit social cues but also to identify a mechanism behind this, namely excessive weighting of social information during periods of uncertainty. Similar learning models can be used for various types of interactions. However, modeling unconstrained real-life interactions is a significant challenge for future research.
Thus, a thorough exploration and systematic characterization of interactions seem crucial for guiding modeling efforts of social interaction disorders and eventually linking them to their underlying causes in the mind, brain, and body.

Eventually, to fully understand and empirically test the brain mechanisms of reciprocal social interactions, we will also need to not only correlate behavior with subsequent brain measures but also be able to measure the brains of two (or more) interacting individuals at the same time to directly link brain activity and behavior. Such hyperscanning studies have been slowly gaining momentum, as briefly described above. In this context, mobile electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) offer the benefit of much reduced constraints on behavior compared to fMRI, although simultaneous fMRI experiments have been performed for some time, either by linking two separate MRI devices (Montague et al., 2002) or with specially designed head coils within one scanner (Renvall et al., 2020). However, it is important to consider when it is necessary to measure multiple people at the same time and when it is sufficient to measure, for example, only one person during an interaction with another person outside of the scanner (cf. Redcay & Schilbach, 2019). Alternatively, brain imaging can be performed sequentially, by first measuring brain activity while making an audio or video recording of, e.g., a person telling a story (Smirnov et al., 2019) or performing hand actions (Smirnov et al., 2017), followed by a measurement of participants listening to or viewing the recording. Hyperscanning studies are complex to run and analyze, and, thus, it may be counterproductive to design such studies when a simpler experimental design would suffice. Moreover, when people are measured while they participate in an interaction, it is particularly important to know how they are behaving (Hamilton, 2020) as no exact schedule for events during the interaction can be enforced.
For example, neural synchrony, which to some extent appears to be associated with sharing "the social world," or state of mind, with other people (Nummenmaa et al., 2018), may, during an interaction, also arise trivially when the interactants just look at the same stimulus at the same time. Thus, during an interaction, synchrony may arise in a similar manner as in the passive observation studies described above without any deeper sharing of mental states. For the latter, activity in the so-called mentalizing network of the brain has been implicated, in particular in situations of direct social interaction (Redcay & Schilbach, 2019). Moreover, because every interaction is different, direct comparisons based only on the brain activity of people are difficult to interpret without characterizing the interaction. Thus, a combination of detailed behavioral characterization and brain-based measures is crucial for a more complete understanding of the neural underpinnings of natural social interactions and disorders thereof in psychopathology.

#### **Conclusions**

In the past, reliance on heterogeneous disorder categories and an overemphasis on the brain have potentially limited the progress of our understanding of the behavioral and neural mechanisms of psychiatric disorders. Moreover, common predisposing mechanisms appear to be shared by multiple disorders, which can lead to nonspecific findings between disorders, and more specific measures of the disorders are required. While subjective mental suffering of patients is not directly accessible to researchers or therapists, disordered social interactions are some of the most severe symptoms of many psychiatric disorders that are, at least in part, detectable and measurable by an external observer. Differences in social behavior are often intuitively used by therapists while diagnosing and interacting with patients, yet rarely are these behavioral abnormalities systematically measured. By carefully characterizing individual behavioral manifestations of the disorders between patients, particularly in social interactions and everyday life, we may better understand the complex disorder phenotypes and their underlying mechanisms and, ultimately, move closer to individualized interventions.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Part IV Methods Used in Social and Affective Neuroscience**

## **Chapter 12 EEG and ERPs in the Study of Language and Social Knowledge**

**Alice Mado Proverbio**

**Abstract** Event-related potentials (ERPs) represent the ideal methodological approach for investigating the time course of language reading and comprehension processes. In this chapter, various ERP components reflecting orthographic, phonological, semantic, and syntactic processing of written and auditory language are examined. Furthermore, data are shown of how ERPs can reflect stereotypes, prejudices and world knowledge, including people's social traits and attributes. In particular, several recent neuroimaging and electrophysiological studies are presented investigating the neural underpinnings of ethnic and sex biases (both in male and female individuals).

**Keywords** EEG and ERPs · Electrophysiology of language · Orthographic analysis · Stereotypes and prejudices

#### **Introduction: EEG and ERP Signals of the Brain**

The electromagnetic activity of the brain essentially translates into (i) electric fields/potentials and oscillatory magnetic fields, which constitute the electroencephalogram and the magnetoencephalogram, and (ii) variations of the electric and magnetic fields caused by nerve impulses induced by external or mental stimuli/events, which result in event-related potentials (ERPs) and event-related fields (ERFs), respectively (for details, see the handbook by Zani and Proverbio (2003)).

The rhythmic EEG oscillations originate in the cortex, but their pacemaker is subcortical and is located in the thalamic nuclei. The electrical potentials of the brain can be detected on the scalp surface through the application of metallic sensors named electrodes, while the magnetic fields are measured by sensitive MEG gradiometers. The potential changes recorded at the scalp derive from the sum of both excitatory and inhibitory postsynaptic potentials of neurons whose apical dendrites are oriented perpendicular to the cortical surface (e.g., pyramidal cells or hyper-columns of the visual cortex).

A. M. Proverbio (\*)

Department of Psychology, University of Milano-Bicocca, Milan, Italy e-mail: mado.proverbio@unimib.it

In general, the typical EEG rhythms of the waking state in adults have a fairly rapid oscillation frequency, which varies between 8 and 25 Hz (alpha and beta rhythms), while in drowsiness and sleep, the EEG rhythm progressively slows, reaching a frequency of 1 Hz in *slow-wave sleep* (SWS), known as the delta rhythm.

ERPs consist of electric potential oscillations that occur in the brain of an individual in response to a stimulus administered in one of the different sensory modalities ("exogenous" potentials) or in relation to higher cognitive functions such as attention, motivation, emotions, and expectations ("endogenous" potentials; see Zani and Proverbio (2003)). ERPs, in whatever modality they are recorded (visual, acoustic, or somatosensory), appear as waveforms characterized by a series of positive and negative deflections whose polarity is marked by the letters P and N, accompanied by increasing numbers indicating the temporal progression of appearance (latency in ms). Each of the ERP components can be considered as the manifestation of neural activity associated with specific stages of information transmission and processing within the brain.

#### **Electrophysiology of Language**

Figure 12.1 shows the time course of linguistic information processing based on data derived from the event-related potentials (ERPs) recording technique. ERPs represent a unique tool in the study and analysis of different stages of linguistic information processing, since they are characterized, on the one hand, by the lack of invasiveness typical of electroencephalographic (EEG) recording and, on the other hand, by an optimal temporal resolution (which may be <1 ms).

The temporal latency of a given deflection or peak (positive or negative voltage shift), visible in the waveform of the ERPs, therefore represents the occurrence of brain processing activity time-locked to a cognitive event (Zani & Proverbio, 2003). For example, the occurrence of a voltage deflection at about 70–80 ms at the scalp sites over the primary visual cortex reflects the arrival of incoming information to the visual cortex and the corresponding activations of neural populations involved in visual information sensory processing. In the same way, the occurrence of a large negative deflection at about 400 ms in response to semantically incomprehensible stimuli reflects semantic meaning analysis processes for a given word. ERP recordings were first applied to the study of language comprehension mechanisms at the end of the 1970s by researchers working in the field of what has since become known as cognitive neuroscience. In 1968, Sutton discovered that the human brain produced a large positive response to those stimuli that were selectively attended at a particular moment (identical in terms of physical characteristics to those disregarded). This implied that it was possible to study mental processes by observing their neurophysiological manifestations.
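In practice, an ERP waveform is conventionally obtained by cutting the continuous EEG into epochs time-locked to stimulus onset and averaging them, after which the latency of a deflection can be read off the average. The following sketch illustrates this with a synthetic single-channel signal; the function names, the 250 Hz sampling rate, the event positions, and the injected negative deflection are all assumptions chosen for illustration, and real recordings would additionally require filtering and artifact rejection.

```python
def average_erp(signal, event_samples, pre, post):
    """Average baseline-corrected epochs time-locked to each event.

    pre/post are the numbers of samples kept before and after event onset.
    """
    epochs = []
    for onset in event_samples:
        if onset - pre < 0 or onset + post > len(signal):
            continue  # skip events too close to the edges of the recording
        epoch = signal[onset - pre : onset + post]
        baseline = sum(epoch[:pre]) / pre  # mean of the prestimulus interval
        epochs.append([sample - baseline for sample in epoch])
    if not epochs:
        raise ValueError("no complete epochs available")
    return [sum(column) / len(epochs) for column in zip(*epochs)]


def peak_latency_ms(erp, pre, sampling_rate_hz, polarity=-1):
    """Latency of the largest poststimulus peak (polarity=-1: negative peak)."""
    post_segment = erp[pre:]
    index = max(range(len(post_segment)), key=lambda i: polarity * post_segment[i])
    return 1000.0 * index / sampling_rate_hz


# Synthetic single-channel signal sampled at 250 Hz (4 ms per sample).
sampling_rate_hz = 250
signal = [0.5 * ((i % 7) - 3) / 3 for i in range(1000)]  # small background activity
events = [100, 400, 700]
for onset in events:
    signal[onset + 100] -= 5.0  # negative deflection 100 samples after onset

pre, post = 25, 150
erp = average_erp(signal, events, pre, post)
print("negative peak latency (ms):", peak_latency_ms(erp, pre, sampling_rate_hz))
```

Averaging works because activity that is not time-locked to the event tends to cancel out across epochs, while the time-locked response remains.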

**Fig. 12.1** Time course of cerebral activation during the processing of linguistic material as reflected by the latency of occurrence of various ERP components. Prelinguistic stimulus sensory processing occurs (P1 component) at about 100 ms poststimulus; orthographic analysis of written words (posterior N1 component) at 150–200 ms; phonologic/phonetic analysis at 200–300 ms, as revealed by phonological mismatch negativity (temporal and anterior pMMN) seen in response to phonologic incongruities (both visual and auditory); and a large centroparietal negativity at about 400 ms (N400), recorded in response to semantic incongruities and indexing lexical access mechanisms. The comprehension of meaningful sentences reaches consciousness between 300 and 500 ms (P300 component); finally, a second-order syntactic analysis is indexed by the appearance of a late positive deflection (P600) at about 600 ms poststimulus latency

To study language, Marta Kutas developed two different experimental paradigms. In the first, *rapid serial visual presentation* (RSVP), single words are consecutively presented in the center of a screen (Kutas, 1987) in order to simulate the process involved in the spontaneous reading of a sentence and to monitor the time course of semantic and syntactic comprehension processes while avoiding the horizontal ocular movements that normally go along with text reading. The second, quite popular, paradigm is called the *final word paradigm* (Kutas & Hillyard, 1980), and it is based on the presentation of a semantic or syntactic context of variable nature and complexity that is followed by a given terminal and critical word, to which brain potential is time-locked and which can be more or less congruent with the context or respectful of various word concatenation rules of a given language.

#### **Orthographic Analysis**

ERPs represent a quite useful tool for investigating reading mechanisms in that they provide several indices of what is occurring, millisecond by millisecond, in the brain, starting from stimulus onset: from the analysis of sensory visual characteristics (e.g., curved or straight lines, angles, circles, etc.) to orthographic analysis (letter recognition), to the analysis of complex patterns (words), and to their orthographic aspect (which, for example, greatly differs for the German, English, or Finnish languages) and their meaning.

Numerous ERP and magnetoencephalography (MEG) studies (e.g., Bentin et al., 1999; Helenius et al., 1999; Proverbio et al., 2002, 2004) have provided clear evidence that the occipitotemporal N170 response (with a mean latency of 150–200 ms) specifically reflects stimulus orthographic analysis (Fig. 12.2). For example, Helenius et al. (1999) recorded MEG signals in dyslexic and control adult individuals engaged in the silent reading of words (either clearly visible or degraded with Gaussian noise) vs. symbolic strings. The results showed that while the first sensory response (100 ms of latency) associated with sensory processing did not differ in amplitude across groups, the N170 component, which is sensitive to orthographic factors and usually focused on the left inferior occipitotemporal cortex (i.e., over the *visual word form area*), was not lateralized and was considerably reduced in amplitude in dyslexic individuals.

**Fig. 12.2** Visual perception of words activates the left occipitotemporal cortex at about 170 ms poststimulus. This response is much larger to words than non-orthographic strings. Lexical processing reaches its peak at about 400 ms

In a recent study (Proverbio et al., 2007), we compared ERPs evoked by words and pseudo-words in their canonical orientation with those elicited by words and pseudo-words flipped horizontally. The aim was to assess whether the inversion of words deprived them of their linguistic properties, thus making them nonlinguistic stimuli. About 1300 Italian words and legal pseudo-words were presented to 18 right-handed Italian students engaged in a letter detection task. In order to identify the temporal latency of alphabetic letter processing and recognition, ERPs evoked by target and nontarget stimuli were compared. ERPs showed an early effect of word orientation at ~150 ms, with larger N1 amplitudes to rotated than to standard words. *Low-resolution brain electromagnetic tomography* (LORETA) localized this increase in N1 to horizontally flipped words primarily in the extrastriate cortex of the right occipital lobe (BA 18), which may indicate an effect of stimulus novelty. N1 was greater to target than to nontarget letters at left lateral occipital sites, thus reflecting the first stage of orthographic processing. LORETA revealed a strong focus of activation for this effect in the left fusiform gyrus (BA 37), which is consistent with the so-called visual word form area, corresponding to the left inferior occipitotemporal cortex.

#### **Lexical Analysis**

After accessing phonologic properties of words during reading, the brain is able to extract their semantic/lexical properties at about 300–400 ms of poststimulus latency (Federmeier et al., 2002), as indexed by the N400 component. The amplitude of this component is generally greater over the right centroparietal areas at the scalp, but this does not correspond to the inner anatomical localization of semantic specialized areas. Intracranial recording studies have shown that N400 generators lie in the left temporal cortex, near the collateral sulcus and the anterior fusiform gyrus. In her original paper, Kutas (1980) used the *final word paradigm* to investigate the functional properties of the N400 response and distinguished between the concepts of "semantic incongruence" and "subjective expectation" of sentence termination. Thus, Kutas postulated the existence of a *contextual constraint* generated by the overall semantic meaning of a sentence, which in itself would not be sufficient to explain the increased N400 effects. In order to illustrate this, let's take, for example, the following sentence: "She put sugar and lemon in her." The overall sentence meaning binds the terminal word to be a sort of drink and especially TEA. In the sentence "She put sugar and lemon in her BOOT," the final word elicits an N400 of noticeable magnitude because the contextual constraint has been macroscopically violated. However, in the sentence "She put sugar and lemon in her COFFEE," the final word still generates an N400 response but of lower amplitude as compared to the incongruent BOOT word. The negativity is generated because COFFEE is semantically less related to sugar and lemon than the word TEA, but the N400 to COFFEE is smaller than the N400 to BOOT because the former belongs to the same semantic domain as TEA (drinks). This finding reflects the effect of the *contextual constraint*. However, the contextual constraint alone cannot predict the whole process of semantic comprehension in reading.

Kutas also introduced the *cloze (closure) probability* factor, defined as the probability that a group of speakers might complete a certain sentence with a specific final word; this effect does not completely correspond to the contextual constraint, because it refers to subjective expectancy and not to semantic relatedness. For instance, in the sentence "He sent a letter without a STAMP," the final word is highly predictable for many speakers, because it has a high cloze probability. Conversely, the final word of "There was nothing wrong with the FREEZER" has a low cloze probability since, from a statistical point of view, not many speakers would complete this sentence in the same way. Therefore, while not being semantically incongruent with the context (and, therefore, not violating the semantic constraint), FREEZER would still elicit an N400 much larger than the word STAMP from the previous sentence, because it is completely unexpected and unpredictable for the speakers. In this vein, the N400 paradigm might be advantageously used to investigate conceptual representation of social attributes in different groups of speakers, including stereotypes and prejudices (race-based, gender-based, etc.).
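Cloze probability, as defined above, can be estimated as the proportion of respondents in a norming sample who complete a sentence frame with a given word. The sketch below shows this computation; the function name and the response counts are invented for illustration and do not come from any published norming study.

```python
from collections import Counter


def cloze_probabilities(completions):
    """Proportion of respondents producing each final word for one sentence frame."""
    normalized = [word.strip().lower() for word in completions if word.strip()]
    counts = Counter(normalized)
    total = len(normalized)
    return {word: count / total for word, count in counts.items()}


# Hypothetical responses to the frame "He sent a letter without a ...".
stamp_frame = ["stamp"] * 18 + ["signature", "reply"]

# Hypothetical responses to the frame "There was nothing wrong with the ...".
freezer_frame = ["car"] * 6 + ["plan"] * 5 + ["food"] * 4 + ["engine"] * 4 + ["freezer"]

print(cloze_probabilities(stamp_frame)["stamp"])
print(cloze_probabilities(freezer_frame)["freezer"])
```

In an ERP experiment, such norming values would typically be collected from a separate group of speakers and then used to sort critical words into high and low cloze conditions.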

The N400 is a hallmark of the semantic integration mechanisms, and, as such, it is sensitive to the difficulty with which the reader/listener integrates the incoming sensory input with the previous context, based on individual expectations. Although the maximum response peak to incompatible, unexpected, or low cloze probability words is reached around 400 ms, earlier ERP responses have also been reported to be sensitive to some lexical properties of words, such as their frequency of use. King and Kutas (1998) described an anterior negative component, the *lexical processing negativity* (LPN), with a latency of about 280–340 ms, which seems very sensitive to the frequency of word occurrence.

In an ERP investigation (Proverbio et al., 2004) in which words, pseudo-words, and letter strings were presented during a phonetic decision task, the earliest effect of a lexical discrimination between words and pseudo-words was observed at about 200 ms poststimulus (Fig. 12.3). It would seem, then, that the neural mechanisms of access to the lexical features of linguistic stimuli are activated in parallel with the extraction of their orthographic and phonologic properties. Some studies have also demonstrated a sensitivity to lexical properties of short, familiar words at latencies even earlier than 150 ms (e.g., Assadollahi & Pulvermüller, 2003; Pulvermüller et al., 2001). Finally, Proverbio et al. (2008) have shown that the orthographic N170 response, generated within the left fusiform gyrus, manifested a sensitivity to sub-lexical properties (word frequency of use), being of greater amplitude in response to high- than low-frequency words.

#### **Pragmatic Analysis**

Just as the P300 represents an index at the scalp of neural mechanisms of *contextual updating*, that is, the updating of personal knowledge as a consequence of comparing the ongoing stimulus input with the information retained in long-term memory (Donchin, 1987), conversely, an increased N400 represents a difficulty in integrating incoming inputs with previous knowledge, including world knowledge of a pragmatic nature (scenarios, such as how to pay for a bus ticket) and social knowledge (social conventions, cultural habits, rules about what is appropriate or not, etc.).

**Fig. 12.3** ERP waveforms recorded in response to words, pseudo-words, and letter strings during a phonetic decision task (e.g., "Is phone /k/ present in oranges?"). The first lexical effect was found at P2 level. (Adapted from Proverbio et al. (2004), with the permission of MIT Press)

For example, let's first consider the classical case of a violation of the semantic constraint such as the one provided by the sentence "Jane told her brother that he was very...," followed by three possible terminal words:


As extensively dealt with in the previous section, case C gives rise to an N400 (depicted in Fig. 12.4, top) since "rainy-ness" is not a possible property of a person, which therefore makes it hard to integrate the meaning of the terminal word with the conceptual representation elicited by the aforementioned sentence. Since this incongruence is verifiable per se, independent of the context or the specific speakers, it is defined as a violation of the semantic constraint (which is to be distinguished from the cloze probability).

However, Hagoort and coworkers (e.g., Van Berkum et al., 1999) discovered that the N400 is also sensitive to violations of the sentence meaning mediated by the context or by social knowledge. Let's consider, for instance, the sentence "At 7:00 a.m., Jane's brother had already taken a shower and had dressed too," followed by "Jane told her brother that he was incredibly...," completed by the terminal words:

A. FAST Congruent

B. SLOW Incongruent

Case B would give rise to a large N400 (depicted in Fig. 12.4, middle) since the conceptual representation of a fast and early-rising brother induced by the previous context is in striking contrast with the way his sister defines him. The semantic incongruence can be extended to implicit or pragmatic knowledge, such as social knowledge. Let's take, for instance, the sentence "On Sunday, I usually go to the park with..." pronounced by the voice of (1) a child and (2) an adult man and followed by two possible terminal words:

A. MY DADDY

B. MY WIFE

Final words 1B and 2A would elicit a wide N400 deflection (Fig. 12.4, bottom) in the absence of any violation of the semantic constraint or of the contextual constraint, thus indexing a pure violation of pragmatic and/or social knowledge. Indeed, the adult male voice would not predict a "daddy" final, just as the childish voice would not predict a "wife" final. These predictions are not based on semantics but on our social knowledge, according to which a child is not usually married and an adult male does not typically use "sugary" language.

Another study by Hagoort et al. (2004) provided a very interesting parallel between violation of the semantic constraint and violation of world knowledge. A typical example of *world knowledge* could be the direction in which doors open (almost always inward but outward in the case of anti-panic doors), a knowledge that is implicitly learned by means of repeated experience with the external world. Hagoort and colleagues comparatively presented three types of sentences:


In their study, sentences B (i.e., a violation of world knowledge) and C (i.e., a semantic violation) elicited N400s of similar amplitudes and topographic distributions in Dutch participants, although these violations were extremely different in type. Everyone knows that a train cannot be acidic (semantic knowledge). Similarly, a Dutch person who has traveled by subway or railway would definitely be aware of the fact that the trains of their town and country are not white.

The difficulty in integrating the incoming information provided by sentences B and C with previous knowledge would stimulate cognitive processes observable at about 400 ms after critical word onset, in the form of an enhanced N400 response.

#### **ERP Indices of Stereotypes and Prejudices**

The N400 has also been found to be affected by personal semantics (Coronel & Federmeier, 2016), that is, by violations relative to subjective knowledge (i.e., personal preferences such as likes and dislikes) across a wide range of topics (including foods, sport teams, music, films, etc.).

A few studies have used the N400 response to investigate the neural representation of stereotypes (Bartholow et al., 2001, 2003). Osterhout and coauthors (1997) showed participants sentences referring to stereotypically male or female occupations and pronouns that did or did not match the gender stereotypically implied by the job (e.g., "The beautician put herself through school" vs. "The beautician put himself through school"). They found increased N400 responses in association with the prejudice violation.

Recently, Proverbio and coauthors (2017) used ERPs to investigate the detection of a discrepancy between gender-based occupational stereotypes and written material presented to 15 Italian viewers in a completely implicit task. No awareness or judgment about stereotypes was involved, no decision had to be made on sentence acceptability or congruence, and no prime words related to gender were presented (which might reveal the matter of the investigation). EEG was recorded while participants were engaged in a task that consisted in quickly pressing a response key to animal words while ignoring the overall study's purpose. Two hundred forty sentences that did or did not violate gender stereotypes were presented randomly mixed with 32 other sentences ending with an animal word. Final words violating gender stereotypes (such as "The notary is BREASTFEEDING" or "Here is the commissioner with HER HUSBAND") elicited a greater anterior N400 response and left anterior negativity (LAN) than words conforming to the gender stereotype (e.g., "The chemist put on a nice TIE") (see Fig. 12.5). LAN modulation suggests that gender stereotypes are processed automatically (as if they were morphosyntactic errors) and hints at how they are deeply rooted in our linguistic brain.

According to the inverse solution applied to incongruent minus congruent ERP difference waves recorded in the 350–450 ms time window, which corresponds to the N400 peak, the neural representation of gender-based stereotypes mostly involved the middle frontal gyrus (MFG), which is known to support the neural representation of stereotypes. The temporal/parietal junction (TPJ) supporting theory of mind (TOM) processes was also engaged, along with the superior and middle temporal gyri (STG and MTG) representing person information. The TPJ has been associated with the ability to attribute intentions and meanings to the behavior of others, which is part of TOM (Saxe, 2010; Young et al., 2010).
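The scalp-level quantity that enters such an analysis, the incongruent minus congruent difference wave averaged over a latency window, can be sketched as follows. This is only an illustration of the subtraction and windowing step, not of the inverse solution itself; the function names and the toy voltages (in µV, sampled every 50 ms at a single hypothetical electrode) are assumptions made for the example.

```python
def difference_wave(incongruent, congruent):
    """Point-by-point subtraction: incongruent minus congruent ERP."""
    if len(incongruent) != len(congruent):
        raise ValueError("both ERPs must have the same number of samples")
    return [i - c for i, c in zip(incongruent, congruent)]


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean voltage of the samples whose latency lies in [start_ms, end_ms]."""
    window = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return sum(window) / len(window)


# Toy averaged ERPs at one electrode (hypothetical values, in microvolts),
# sampled every 50 ms from 0 to 600 ms after critical word onset.
times_ms = list(range(0, 601, 50))
congruent = [0.0, 0.0, 0.0, 0.0, 0.0, -0.5, -1.0, -1.0, -1.0, -1.0, -0.5, 0.0, 0.0]
incongruent = [0.0, 0.0, 0.0, 0.0, -0.5, -1.0, -2.0, -3.0, -4.0, -3.0, -2.0, -1.0, 0.0]

diff = difference_wave(incongruent, congruent)
# Mean amplitude of the difference wave in the 350-450 ms window used in the text.
n400_effect = mean_amplitude(diff, times_ms, 350, 450)
```

A more negative `n400_effect` corresponds to a larger N400 to incongruent than to congruent final words.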

According to the neuroimaging literature, the medial frontal cortex (mdFC) represents social information that refers to others, particularly outgroup stereotyping and prejudice (Mitchell et al., 2006). In particular, sub-regions of the medial prefrontal cortex (mdPFC) would differentiate between thinking about the attributes and mental states of similar versus dissimilar others (Mahy et al., 2014). In a recent study on the neural bases of prejudice, it was found that the left cortical superior frontal gyrus (SFG, BA10) was particularly involved in representing negative prejudices related to others (Proverbio et al., 2016), which strongly fits with the current findings.

**Fig. 12.5** N400 and LAN components elicited by incongruent (with respect to stereotypes) sentences over anterior scalp sites in the Proverbio et al. (2017) study. (Courtesy of the authors)

In that study, the neural bases and functional properties of social prejudices were investigated. During social interactions, we make inferences about people's personal characteristics based on their appearance. These inferences form a potential prejudice that can positively or negatively bias our interaction with them. This ability was investigated by recording event-related potentials from 128 scalp sites in 16 volunteers. In the first session (encoding), they viewed 200 faces associated with a short fictional story that described anecdotal positive or negative characteristics about each person (see an example in Fig. 12.6).

In the second session (recognition), participants underwent an old/new memory test, in which they had to distinguish 100 new faces from the previously shown faces. ERP data relative to the encoding phase showed a larger anterior negativity in response to negatively (vs. positively) biased faces, indicating a deeper neural processing of faces with unpleasant social traits. In the recognition task, ERPs recorded in response to new faces elicited a larger FN400 than to old faces and to positive than negative faces. This piece of data indicates that negatively valenced faces were recognized as more familiar than positively valenced ones. Additionally, old faces elicited a larger old-new parietal response than new faces, in the form of an enlarged late positive component (LPC). A swLORETA inverse solution (applied to ERPs in the 450–550 ms poststimulus window) indicated that remembering old faces was associated with the activation of the right superior frontal gyrus (SFG), left middle temporal gyrus (mdTG), and right fusiform gyrus (FG). However, only negatively connoted faces strongly activated the limbic and parahippocampal areas and the left SFG (Fig. 12.7). A dissociation was found between familiarity (modulated by negative bias) and recollection (distinguishing old from new faces). Not only did ERPs show the existence of prejudices formed during the learning phase, but these prejudices were also able to affect the recognition and memory recall of faces, with an advantage for negatively valenced social information.

**Fig. 12.6** Examples of how Proverbio et al. (2016) induced a positive or negative prejudice about previously unknown persons. In the encoding task, faces were presented in association with a short story that provided fictional information about the character, such as an anecdote or personal information. The biographic information could be positive, thereby inducing a positive prejudice toward the depicted character, or negative, thereby inducing a negative prejudice. (Courtesy of Proverbio and coauthors)

**Fig. 12.7** Sagittal views of active sources during processing of negatively biased, positively biased, and new faces according to swLORETA analysis during the 450–550 ms time window. The images highlight the strong activation of the left middle frontal gyrus during memory recall of faces associated with a negative prejudice. (Taken from Proverbio et al. (2016) with permission from the authors. Creative Commons Public Domain picture)

Going back to gender-based prejudices, quite recently, Proverbio and coauthors (2018) showed that ERPs are so sensitive to social representations and constructs such as prejudices and stereotypes that it is possible to find differences within the population as a function of the different degree of prejudice possessed. In this study, the time course and the neural correlates involved in the representation of occupational gender bias were investigated by addressing two questions: first, whether the bias varied as a function of the participant's sex and, second, whether there was a difference based on the gender of the character depicted in the phrases presented to participants. Sentences were created in such a way that the gender of the character engaging in a given professional activity or behavior was made explicit only at the very end of the sentence (*final word paradigm*). An implicit paradigm was chosen to trigger the automatic activation of any mental function involved in the processing of gender stereotypes. This was carried out by recording electrophysiological responses in heterosexual Italian university students during the reading of hundreds of sentences depicting female and male characters and their professional attitudes (see Table 12.1 for some examples of sentences carrying typical female or male stereotypes). The task consisted in responding as quickly and accurately as possible to animal words, that is, an implicit task designed to avoid social desirability processes. Brain responses of male and female participants totally unaware of the study's purpose were compared as a function of whether the sentence was congruent or not with a gender stereotype.

EEG was recorded from 128 sites in 38 Italian participants. While looking for rare animal words, participants read 240 sentences, half of which expressed notions congruent with gender stereotypes, and the other half did not. ERPs were time-locked to critical words. Findings showed enhanced anterior N400 and occipitoparietal P600 responses to items that violated gender stereotypes, mostly in men (Fig. 12.8). The swLORETA analysis applied to N400 potentials in response to incongruent phrases showed that the most activated areas during stereotype processing were the right middle temporal (mdTG) and middle frontal gyri (mdFG), as well as the TPJ, as expected on the basis of previous literature (Fig. 12.9). The data hint at a gender difference in stereotyping, with men being more prejudiced, especially when the depicted character was a male. One possible interpretation of these findings relies on the asymmetrical nature of occupational stereotypes, mostly rooted in the principle that females could not perform male professions because of a lack of strength or powerful attitude. Therefore, it is conceivable that women participants, being women themselves, might disagree more easily with the stereotype.

An asymmetry in gender bias, with a stronger prejudicial attitude in men, is not unknown in the literature. For example, an article summarizing data from more than 2.5 million completed IATs (Implicit Association Tests) and self-reports (Nosek & Smyth, 2007) showed that men are more prejudiced in terms of theories postulating that they have more social dominance (e.g., Sidanius & Pratto, 1999), attitudes toward gay vs. heterosexual people (e.g., Negy & Eisenman, 2005), and attitudes toward black vs. white people (e.g., Qualls et al., 1992). As for neuroscientific data, only in men was it shown that hostile sexism correlated with the activation of brain regions associated with mental state attributions (such as the medial prefrontal cortex (mdPFC), the posterior cingulate cortex (pCC), and the temporal poles) in Cikara et al.'s (2011) fMRI study.

**Table 12.1** Example of sentence stimuli, relative to men or women, and violating or not current occupational gender stereotypes in which women engage more in care-related professions and men in strength- or power-related professions


Because these stereotypes are a part of everybody's cultural heritage and learned early in life, people form implicit gender stereotypes, which automatically associate men and women with stereotypical traits, abilities, and roles, even when they disavow these traditional beliefs (e.g., Nosek et al., 2002). For instance, women are typically stereotyped as being nicer (Eagly & Mladinic, 1989) and are more likely to enact subordinate roles that require communal traits. The presence of gender stereotyping has been demonstrated for an extensive list of role nouns in Czech, English, French, German, Italian, Norwegian, and Slovak by Misersky et al. (2014). To determine whether the sentences actually represented (or violated) stereotypes for university students living in the Milan metropolitan area, the stimuli underwent validation, in which a group of Milan University students were asked to rate, by means of a 3-point Likert scale, how they reacted to reading the terminal word of the phrase. Scale units were as follows: 0 = Actually, I was a bit surprised. 1 = I do not know. 2 = I kind of saw that coming.

**Fig. 12.8** Isocolor topographical maps (front view) of surface voltage measured in the 250–400 ms temporal window (N400 latency range) to incongruent stimuli as a function of participants' sex. It can be appreciated how the N400 response to stereotype violation was not found in female participants. This suggests that female participants were not surprised by final words that violated sex stereotypes. (Adapted from Proverbio et al., 2018)

**Fig. 12.9** Coronal and axial brain sections showing the location and strength of electromagnetic dipoles explaining the surface difference voltage obtained by subtracting ERPs to congruent from ERPs to incongruent stimuli in the 250–400 ms latency range, corresponding to the peak of N400. *L* left, *R* right, *A* anterior, *P* posterior, *MTG* middle temporal gyrus, *MFG* middle frontal gyrus. (Taken from Proverbio et al., 2018)

Overall, ERPs, and especially the amplitude of the N400 component, proved to be extremely sensitive to violations of implicit stereotypes, thus making it possible to tap into the representation of social attributes (such as stereotypes and prejudices) without the problems related to social desirability processes and without participants being minimally aware of the study's purpose or experimental manipulation. For this reason, ERPs represent one core research technique for studying social cognition, including the representation of social attributes.

Indeed, the N400 paradigm was also used for detecting implicit ethnic prejudices such as negative biases against rural migrant workers (Wang et al., 2011), unarmed African American individuals (Correll et al., 2006), or non-Caucasian (other race (OR)) professionals (Brusa et al., 2021). For example, Brusa et al. (2021) presented to Caucasian students 285 sentences that could either violate, not violate, or be neutral with regard to stereotypical concepts concerning OR individuals (e.g., Asians, Africans, Arabs). No awareness or judgment about stereotypes was required. Participants passively read the sentences while engaged in a fictitious task, unaware of the overall study's purpose. Stimuli violating negative ethnic stereotypes elicited a large anterior N400 response, and participants' individual amplitude values of the N400 difference wave (incongruent minus congruent) showed a direct correlation with the individual racism scores obtained on the *Subtle and Blatant Prejudice Scale*, administered at the end of the experimental session. The stronger the racial bias, the larger the N400 response.

These findings encourage the use of subjective, implicit, and explicit psychological scales to be correlated with physiological measures in the study of social stereotypes. Indeed, while the N400 paradigm allows implicit access to the representation of racial or sexual stereotypes, avoiding the activation of control processes guided by social desirability concerns, the correlation between electrophysiological and behavioral measures can provide a wider and more complex view of psychological processes. Prejudices can exist (as demonstrated by electrophysiological signals) in the absence of conscious awareness and in contrast to a convinced voluntary progressive attitude of individuals.
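As a rough illustration of how a brain-behavior relationship of this kind can be quantified, the sketch below computes a Pearson correlation between a per-participant N400 effect magnitude and a questionnaire score. The choice of Pearson's coefficient, the function name, and all values are assumptions made for the example; the statistic and the data actually reported by Brusa et al. (2021) are not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# Hypothetical data for five participants (invented for illustration):
# questionnaire scores and the magnitude of the N400 difference wave
# (sign flipped so that a larger value means a larger negativity, in microvolts).
prejudice_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
n400_magnitudes = [0.5, 1.1, 1.4, 2.2, 2.4]

r = pearson_r(prejudice_scores, n400_magnitudes)
```

With real data, a significance test and a check of the coefficient's assumptions would be needed before interpreting `r`.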

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 13 Brain Imaging Methods in Social and Affective Neuroscience: A Machine Learning Perspective**

#### **Lucas R. Trambaiolli, Claudinei E. Biazoli Jr, and João R. Sato**

**Abstract** Machine learning (ML) is a subarea of artificial intelligence which uses the induction approach to learn based on previous experiences and make conclusions about new inputs (Mitchell, Machine learning. McGraw Hill, 1997). In the last decades, the use of ML approaches to analyze neuroimaging data has attracted widening attention (Pereira et al., Neuroimage 45(1):S199–S209, 2009; Lemm et al., Neuroimage 56(2):387–399, 2011). Particularly interesting recent applications to affective and social neuroscience include affective state decoding, exploring potential biomarkers of neurological and psychiatric disorders, predicting treatment response, and developing real-time neurofeedback and brain-computer interface protocols. In this chapter, we review the bases of the most common neuroimaging techniques, the basic concepts of ML, and how it can be applied to neuroimaging data. We also describe some recent examples of applications of ML-based analysis of neuroimaging data to social and affective neuroscience issues. Finally, we discuss the main ethical aspects and future perspectives for these emerging approaches.

**Keywords** Brain imaging methods · Machine learning · Neuroscience machine learning · Emotion/affective decoding · Neurofeedback

L. R. Trambaiolli (\*)

Basic Neuroscience Division, Mclean Hospital – Harvard Medical School, Belmont, MA, USA e-mail: ltrambaiolli@mclean.harvard.edu

C. E. Biazoli Jr · J. R. Sato Center for Mathematics, Computing, and Cognition, Federal University of ABC, São Bernardo do Campo, Brazil

#### **Introduction**

Machine learning (ML) is a subarea of artificial intelligence which uses the induction approach to learn based on previous experiences and make conclusions about new inputs (Mitchell, 1997). In the last decades, the use of ML approaches to analyze neuroimaging data has attracted widening attention (Pereira et al., 2009; Lemm et al., 2011). Particularly interesting recent applications to affective and social neuroscience include affective state decoding, exploring potential biomarkers of neurological and psychiatric disorders, predicting treatment response, and developing real-time neurofeedback and brain-computer interface protocols. In this chapter, we review the bases of the most common neuroimaging techniques, the basic concepts of ML, and how it can be applied to neuroimaging data. We also describe some recent examples of applications of ML-based analysis of neuroimaging data to social and affective neuroscience issues. Finally, we discuss the main ethical aspects and future perspectives for these emerging approaches.

#### **Brain Imaging Methods**

Most neuroimaging experiments in human social and affective neuroscience are based on two groups of techniques (Fig. 13.1) (Min et al., 2010). The first group comprises measurements of either electrical or magnetic features associated with the electrophysiological activity of neuronal assemblies. This group includes electroencephalography (EEG) and magnetoencephalography (MEG) data acquisitions. The second group, on the other hand, comprises measurements of metabolic or hemodynamic features that are indirectly associated with neural activity. This second group of neuroimaging techniques includes functional magnetic resonance imaging (fMRI), functional near-infrared spectroscopy (fNIRS), and positron emission tomography (PET).

**Fig. 13.1** Electromagnetic-based imaging approaches (left) use electric or magnetic sensors to capture the electromagnetic resultants from the neuronal and synaptic activity. Hemodynamic-based procedures (right) use light or magnetic sensors to measure the cerebral blood flow and oxygen consumption levels

Among the electromagnetic approaches, the EEG uses electrodes positioned over the scalp to record the sum of excitatory and inhibitory postsynaptic potentials in which the resulting dipoles are perpendicularly aligned to the scalp (Niedermeyer & da Silva, 2005). In consequence, its spatial resolution is limited and further compromised by volume conduction effects. However, its simplicity, low cost, and high temporal resolution (reaching the order of kilohertz in modern systems) make it one of the most common techniques in social and affective experiments. Similarly, MEG signals result from the magnetic field generated by postsynaptic currents in apical dendrites (mainly those tangential to the skull) (Hansen et al., 2010). Despite presenting some mapping limitations similar to the EEG, MEG has a better spatial resolution, though restricted to superficial cortical sulci activity. Moreover, its higher cost and lower availability when compared to EEG result in relatively fewer studies in human affective neuroscience using this technique (Min et al., 2010).

PET scanning is the pioneering metabolic and hemodynamic imaging approach. This technique uses an injected radioactive tracer to track brain tissue variations in blood flow and metabolic features associated with local neural activity (Maquet, 2000). However, with the emergence of noninvasive fMRI protocols, which do not depend on exogenous tracers, PET experiments became relatively less common in current research. The fMRI uses the paramagnetic properties of the deoxyhemoglobin molecules, which work as an endogenous tracer, to measure the blood-oxygen-level-dependent (BOLD) contrast effect (Ogawa et al., 1990). Both PET and fMRI acquisitions provide the highest spatial resolution among the brain imaging approaches, allowing the evaluation of both cortical and subcortical structures associated with social behavior and affective states (Liu et al., 2015). The worldwide availability of MRI scanners in clinical settings made fMRI the most used neuroimaging technique in the last two decades. The main limitations of fMRI in affective and social process research, when compared with other approaches, are its lower temporal resolution, scanner noise, and a setup that restricts movement (Doi et al., 2013). Hence, fMRI acquisition does not allow more naturalistic, out-of-the-laboratory protocols. As a complementary hemodynamics-based technique for more naturalistic settings, the fNIRS has the advantages of portability, low cost, and a relatively good temporal-spatial ratio (Doi et al., 2013). This technique measures the absorption of near-infrared light by oxyhemoglobin and deoxyhemoglobin molecules in superficial layers of the brain tissue, during local neural activity (Ferrari & Quaresima, 2012). However, fNIRS acquisitions only cover brain layers close to the scalp, as is the case with MEG (Min et al., 2010), and with a sparse representation limited by the optode arrangement.

In sum, each neuroimaging modality has advantages and disadvantages, and the choice of a particular technique should be based on the specific research question. More recently, the use of multimodal setups emerged as a promising approach in the neuroimaging field. These approaches use two or more neuroimaging techniques, aiming to combine their advantages and provide complementary and convergent information regarding the underlying neural phenomena (Liu et al., 2015). The most common combination involves at least one electromagnetic and one hemodynamic approach, such as EEG-fMRI, EEG-fNIRS, or EEG-fNIRS-fMRI. However, combinations within the same group of techniques are possible, such as EEG-MEG and fNIRS-fMRI.

#### **Basic Concepts of Machine Learning**

The primary aim of a machine learning algorithm is to *learn* (i.e., extract knowledge) from an original dataset (training set), validate its ability to make predictions in an independent dataset (validation set), and then make decisions or predictions in new samples (test set) (Mitchell, 1997). During the learning process, the decision model bases its conclusions on patterns observed in the features of the examples in the training set. Such features might include, for example, frequencies of neural activity during specific tasks, event-specific potentials for a particular set of stimuli, or the connectivity level between different brain areas (Rubinov & Sporns, 2010; Sakkalis, 2011).

#### *Learning Process*

Three main approaches might be used to guide the learning process, according to the presence or absence of labels for each example (i.e., instance or subject in the dataset) (Fig. 13.2). The first approach (which will be the focus of this chapter) is supervised learning, where each instance has a corresponding label (e.g., patient or healthy subject). In this case, the objective is to develop models which can predict the desired labels with minimal error (Larranaga et al., 2006). Thus, during the learning process, the algorithm continually evaluates and adjusts the decision model until it reaches a near-to-optimal performance (Kuhn & Johnson, 2013). In unsupervised procedures, on the other hand, labels are not provided during the learning process. Here, the aim is to extract patterns exclusively based on similarities among groups of features (usually grouping examples according to these measures) (Larranaga et al., 2006). Finally, the third approach merges the characteristics of both previous methods. In this so-called semi-supervised approach, both labeled and unlabeled examples are used during the learning process. This approach takes advantage of the higher precision from the labeled training, as well as the lower computational cost from the non-labeled training (Cohen et al., 2004).

**Fig. 13.2** Supervised learning methods use labeled examples to learn from data, while unsupervised learning methods extract patterns from data using unlabeled inputs. The recently proposed semi-supervised approach, in turn, combines both labeled and unlabeled inputs during the learning process
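The contrast between the supervised and unsupervised approaches described above can be sketched on a single hypothetical feature. The nearest-centroid classifier and the two-cluster grouping below are deliberately simple stand-ins chosen for illustration, not the algorithms used in the studies cited; the feature values and function names are assumptions.

```python
def fit_centroids(features, labels):
    """Supervised: average the feature separately for each provided label."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Assign a new example to the label whose centroid is closest."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def two_means(features, iterations=10):
    """Unsupervised: group examples into two clusters without using labels."""
    c_low, c_high = min(features), max(features)
    if c_low == c_high:
        raise ValueError("at least two distinct feature values are required")
    for _ in range(iterations):
        low = [x for x in features if abs(x - c_low) <= abs(x - c_high)]
        high = [x for x in features if abs(x - c_low) > abs(x - c_high)]
        c_low, c_high = sum(low) / len(low), sum(high) / len(high)
    return c_low, c_high


# One hypothetical feature per subject (e.g., a connectivity index).
features = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
labels = ["healthy", "healthy", "healthy", "patient", "patient", "patient"]

centroids = fit_centroids(features, labels)   # uses the labels
predicted = predict(centroids, 2.9)           # label for a new subject
cluster_centers = two_means(features)         # ignores the labels
```

In this toy case the unsupervised grouping recovers the same two groups as the labels, which need not happen with real neuroimaging features.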

#### *Validation Procedures*

To converge on optimal decisions during the learning process, the decision model is continuously tested with a second dataset (i.e., the validation set) and, if necessary, remodeled using the training set (Kuhn & Johnson, 2013). The best approach for this procedure would be to train and validate the model with as much data as possible. However, due to experimental design constraints or limited sample sizes, this task is commonly performed using somewhat suboptimal datasets (Lemm et al., 2011). Critically, to avoid variance and bias, the processes of training, validating, and testing the model *should not* be performed on the same data (Pereira et al., 2009). Different validation approaches have been proposed to overcome the issues raised by using limited datasets. One popular strategy for experiments using supervised learning is the cross-validation method (Lemm et al., 2011). In this approach, a small sample of the dataset is first set aside to be used as the test set, while the remaining part is further split into the training and validation sets (Fig. 13.3a). This partitioning procedure is repeated several times to create different samples for each iteration (Lemm et al., 2011).

Different partitioning schemes might be used for this division. For example, in k-fold cross-validation (Fig. 13.3b), the dataset is divided into k disjoint subsets of equal size. Then, k-1 folds are used to train the model, and the remaining one is used for validation. This last step is repeated k times until all subsets have been used as the validation set (Pereira et al., 2009). Another popular approach is leave-one-out cross-validation (Fig. 13.3c), which is a particular case of k-fold cross-validation where k is equal to the number of examples.
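The k-fold and leave-one-out partitioning schemes can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `k_fold_splits` and `leave_one_out_splits` are hypothetical, examples are identified only by their index, and the indices are not shuffled (in practice they usually would be).

```python
def k_fold_splits(n_examples, k):
    """Yield (train_indices, validation_indices) for k disjoint folds."""
    if not 2 <= k <= n_examples:
        raise ValueError("k must be between 2 and the number of examples")
    indices = list(range(n_examples))
    # Spread any remainder so that fold sizes differ by at most one example.
    base, extra = divmod(n_examples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


def leave_one_out_splits(n_examples):
    """Leave-one-out is k-fold with k equal to the number of examples."""
    return k_fold_splits(n_examples, n_examples)


# Ten examples in five folds: each fold validates on 2 and trains on 8.
folds = list(k_fold_splits(10, 5))
# Four examples, leave-one-out: each example is the validation set once.
loo = list(leave_one_out_splits(4))
```

Every example appears in exactly one validation fold, which is the property that distinguishes k-fold from the random resampling used in Monte Carlo cross-validation.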

Finally, in Monte Carlo cross-validation (Kuhn & Johnson, 2013) (Fig. 13.3d), the training and validation sets are composed of a fixed number of examples (e.g., X% for training and (100-X)% for validating). Then, samples are randomly selected to form each set. This procedure might be repeated until all combinations are tested (high computational cost) or up to a predetermined number of permutations.
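As a rough sketch of the Monte Carlo scheme, the snippet below draws a predetermined number of random training/validation splits of fixed size. The function name `monte_carlo_splits`, the 80/20 proportion, and the fixed seed are illustrative assumptions, not values prescribed by the chapter.

```python
import random


def monte_carlo_splits(n_examples, train_fraction, n_repeats, seed=0):
    """Return n_repeats random (train_indices, validation_indices) pairs."""
    n_train = round(train_fraction * n_examples)
    if not 0 < n_train < n_examples:
        raise ValueError("train_fraction leaves one of the two sets empty")
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    splits = []
    for _ in range(n_repeats):
        shuffled = list(range(n_examples))
        rng.shuffle(shuffled)
        # The first n_train shuffled indices train the model, the rest validate it.
        splits.append((sorted(shuffled[:n_train]), sorted(shuffled[n_train:])))
    return splits


# 20 examples, 80% for training and 20% for validation, repeated 5 times.
mc_splits = monte_carlo_splits(n_examples=20, train_fraction=0.8, n_repeats=5)
```

Unlike k-fold, nothing here guarantees that every example is used for validation; the same example may be drawn into several validation sets or into none.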

**Fig. 13.3** Different steps and approaches for data splitting. (**a**) The first step of the validation process is to select a sample subset for testing purposes. Then, cross-validation approaches are used to split the remaining data into training and validation subsets. (**b**) During the k-fold cross-validation, data is split into k folds of similar lengths. Then, the algorithm is validated k times, until all folds have been used as the validation subset. (**c**) The leave-one-out cross-validation is a particular case of k-fold cross-validation, where each fold corresponds to a single example. (**d**) The Monte Carlo cross-validation performs a predetermined number of combinations, where the validation subset is composed of a fixed quantity of randomly selected samples

#### *Dimensionality Reduction*

In contrast to the limited number of examples, supervised models usually have a wide range of features associated with them. This growing abundance of assessed features relates to the improvement of brain imaging technologies and the development of new feature extraction methods. However, contrary to a common belief, the increasingly high dimensionality of neuroimaging datasets does not necessarily lead to improved ML models. Indeed, many of these new features are redundant or irrelevant to the model design and might even cause a decrease in performance (Guyon & Elisseeff, 2003). With this in mind, dimensionality reduction strategies have become a fundamental step in model building (Lemm et al., 2011).

Like the learning approaches, feature selection (FS) methods can be grouped into unsupervised and supervised categories. The common spatial pattern (CSP), an example of a supervised method, uses the class label to search for an optimal and reduced subset of features that retains as much relevant information as possible (Lemm et al., 2011). On the other hand, unsupervised methods, such as principal component analysis (PCA) and independent component analysis (ICA), are mainly used for dimensionality and noise reduction based on projections onto the most relevant factors or on the grouping of effects (Lemm et al., 2011). However, unlike the supervised category, unsupervised methods often require manual selection of relevant factors or groups.

Over the last decades, supervised FS methods have become popular in neuroscience (Huang, 2015). To select these optimal subsets of features, some design choices must be established, such as the search strategy and the level of interaction with the ML algorithm.

Regarding the search strategy, two main approaches are possible, according to the subset composition. In the first strategy, all features are sorted according to some relevance criterion. Then, only the highest-ranked features are selected to compose the subset (Huang, 2015). In the second strategy, subgroups are created with random features from the original feature set. Then, these subsets are evaluated according to their capacity to describe the whole dataset (Huang, 2015). The ideal FS algorithm would explore all combinations available to compose the feature subsets (i.e., perform an exhaustive search) (Guyon & Elisseeff, 2003). However, due to the complexity of the problem and to computational limitations, it is common to establish a stop criterion that defines when the algorithm settles on one subset of features (e.g., when the model reaches a specific performance threshold or when the subset reaches a particular number of features) (Guyon & Elisseeff, 2003).
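The ranking strategy can be illustrated with a small sketch. The relevance criterion used here, the absolute Pearson correlation between each feature and a binary label, is only one possible choice and is an assumption of this example; the function names and the toy data are likewise hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return 0.0 if sd_x == 0 or sd_y == 0 else cov / (sd_x * sd_y)


def rank_features(X, y, n_selected):
    """Sort features by relevance and keep the n_selected highest-ranked ones."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:n_selected], scores


# Toy data: feature 0 tracks the label, feature 1 is noisy, feature 2 is constant.
X = [
    [0.1, 5.0, 1.0],
    [0.2, 3.0, 1.0],
    [0.9, 4.0, 1.0],
    [0.8, 4.5, 1.0],
]
y = [0, 0, 1, 1]
top_features, relevance = rank_features(X, y, n_selected=1)
```

Here the stop criterion is simply the requested subset size; a performance threshold would require evaluating a model, as in the wrapper approach.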

According to the level of interaction with the ML model, feature selection algorithms might also be grouped into three approaches (Kohavi & John, 1997) (Fig. 13.4): filter, wrapper, and embedded. The filter approach is the most commonly used procedure. Here, feature selection is performed before and independently of the model induction (Fig. 13.4a). For the wrapper approach, every feature set is submitted to the ML algorithm, and the model performance is used to evaluate the selected subset (Fig. 13.4b). Finally, embedded approaches merge the feature selection and the model induction steps, with the subsets being created internally by the ML model (Fig. 13.4c).
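To make the wrapper idea concrete, the sketch below scores each candidate subset by the performance of a classifier trained on it. Several choices are assumptions made purely for illustration: the classifier is a simple nearest-centroid rule, performance is leave-one-out accuracy, the search is a greedy forward selection, and the stop criterion is either a maximum subset size or the absence of improvement.

```python
def nearest_centroid_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on a feature subset."""
    correct = 0
    for held_out in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[i] for i in range(len(X)) if i != held_out and y[i] == label]
            centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
        point = [X[held_out][f] for f in features]
        # Ties are resolved in favor of the smallest label.
        predicted = min(
            sorted(centroids),
            key=lambda label: sum((p - c) ** 2 for p, c in zip(point, centroids[label])),
        )
        correct += predicted == y[held_out]
    return correct / len(X)


def forward_selection(X, y, max_features):
    """Greedily add the feature that most improves the wrapped classifier."""
    chosen, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(chosen) < max_features:
        scored = [(nearest_centroid_accuracy(X, y, chosen + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda item: (item[0], -item[1]))
        if score <= best_score:
            break  # stop criterion: no candidate improves performance
        chosen.append(feature)
        remaining.remove(feature)
        best_score = score
    return chosen, best_score


# Toy data: feature 1 separates the two classes, feature 0 does not.
X_toy = [[5.0, 0.1], [1.0, 0.2], [4.0, 0.15], [2.0, 0.9], [5.5, 0.8], [1.5, 0.85]]
y_toy = [0, 0, 0, 1, 1, 1]
wrapper_subset, wrapper_score = forward_selection(X_toy, y_toy, max_features=2)
```

Because the classifier is retrained for every candidate subset, the wrapper approach is considerably more expensive than a filter, which is consistent with the filter being the more commonly used procedure.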

#### *Types of Classifiers*

Different types of classifiers are defined according to the specific assumptions made during the learning process (Pereira et al., 2009). For example, logic-based algorithms create successive layers in which instances are classified according to the values of a single feature. These algorithms might be described as a decision tree, which is composed of nodes and branches (Fig. 13.5a). Each node has a particular rule that divides the instances into different branches according to the corresponding feature value (Murthy, 1998). The first node of the tree is the feature that best separates the training data, followed by nodes ordered by decreasing predictive power until no more rules are necessary to classify the dataset correctly. This kind of algorithm tends to perform better when dealing with categorical features (Kotsiantis, 2007).
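The choice of the first node can be sketched as a search for the single rule (feature and threshold) that best separates the training data. This is a simplified illustration under stated assumptions: labels are binary (0 or 1), separation is measured by training accuracy rather than by the impurity measures typically used by decision tree algorithms, and the function name `best_root_split` is hypothetical.

```python
def best_root_split(X, y):
    """Return (feature, threshold, accuracy) of the best single-feature rule."""
    best = (None, None, 0.0)
    for feature in range(len(X[0])):
        values = sorted({row[feature] for row in X})
        # Candidate thresholds are midpoints between consecutive observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            # Try both orientations: label 1 above the threshold, or label 0 above.
            for label_above in (0, 1):
                predictions = [
                    label_above if row[feature] > threshold else 1 - label_above
                    for row in X
                ]
                accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
                if accuracy > best[2]:
                    best = (feature, threshold, accuracy)
    return best


# Toy data: feature 0 separates the classes perfectly, feature 1 does not.
X_tree = [[1, 7], [2, 3], [8, 6], [9, 2]]
y_tree = [0, 0, 1, 1]
root = best_root_split(X_tree, y_tree)
```

A full tree would apply the same search recursively to the instances reaching each branch until no further rules are needed.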

**Fig. 13.4** Level of interaction between the feature selection algorithm and the classifier. (**a**) During the filter approach, the feature selection is performed before and apart from the classifier. (**b**) During the wrapper approach, every single feature subset is submitted to the classifier, and the classification performance is used to evaluate the sample. (**c**) During the embedded procedure, both the feature selection and the classifier algorithms are merged and happen simultaneously

In perceptron-based algorithms, the perceptron computes a linear combination of the input features (i.e., the sum of all weighted inputs) to make a decision. When the result is higher than a specified threshold, the instance is labeled as class A; otherwise, it is labeled as class B (Mitchell, 1997). These weights are randomly established at first but optimized during the learning process until they reach near-optimal predictions (Mitchell, 1997). The perceptron approach, however, can only classify linearly separable inputs (Kotsiantis, 2007). To perform nonlinear discrimination, the use of artificial neural networks (ANN) was proposed. Here, multiple perceptrons are combined to create a complex network where the output from one single perceptron might be used as an input for several other perceptrons (Fig. 13.5b) (Zhang, 2000).
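A single perceptron and its weight updates can be sketched as follows. This is a minimal illustration, not the formulation of any cited work: the weights start at zero rather than at random values so that the run is reproducible, the threshold is folded into a bias term, and the learning rate and number of epochs are arbitrary assumptions.

```python
def predict(weights, bias, features):
    """Label 1 if the weighted sum exceeds the threshold (here, 0), else 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(X, y, learning_rate=1.0, n_epochs=20):
    """Adjust weights after each misclassified example (perceptron rule)."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(n_epochs):
        for features, target in zip(X, y):
            error = target - predict(weights, bias, features)
            # error is 0 for correct predictions, so weights change only on mistakes.
            weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Logical AND is linearly separable, so a single perceptron can learn it.
X_and = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_and = [0, 0, 0, 1]
weights, bias = train_perceptron(X_and, y_and)
and_predictions = [predict(weights, bias, row) for row in X_and]
```

Replacing the targets with those of logical XOR (`[0, 1, 1, 0]`) gives a problem that is not linearly separable, so no setting of the weights classifies all four inputs correctly; this is the limitation that motivates combining perceptrons into networks.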

Unlike other classifiers, statistical-based algorithms provide the probability of the evaluated instance belonging to any given class (Kotsiantis, 2007). A classic example of this group of algorithms is the linear discriminant analysis (LDA), which explores linear combinations of features that best label instances into the desired classes (Fig. 13.5c) (Balakrishnama & Ganapathiraju, 1998).

**Fig. 13.5** Examples of classifiers commonly applied to neuroimaging studies. (**a**) A decision tree, (**b**) artificial neural networks, (**c**) linear discriminant analysis, (**d**) support vector machines

Finally, support vector machines (SVM) compose a non-probabilistic method inspired by statistically based approaches. In this case, data is separated into two classes by a hyperplane (Vapnik, 1995). This hyperplane is defined so as to maximize its distance (margin) to the instances of either category (Fig. 13.5d), consequently reducing the expected generalization error (Cristianini & Shawe-Taylor, 2000). For the classification of non-separable data, the dataset might be mapped onto a higher-dimensional space using kernel methods, where the SVM-designed hyperplane can then be applied (for more details about kernel methods, please refer to Cristianini & Shawe-Taylor, 2000).

Although multiclass classification approaches have been designed for the previously listed classifiers, binary classification (e.g., task vs. control group, task A vs. task B, etc.) is most commonly applied in social and affective neuroscience studies.

#### *Evaluating and Interpreting a Machine Learning Model*

**Fig. 13.6** Illustrative example of (**a**) a confusion matrix and (**b**) three different examples of ROC curves representing classifiers with excellent (dotted line), good (dashed line), and bad (continuous line) performances

One easy way to evaluate the performance of a binary classifier is the use of a confusion matrix (or error matrix) (Sokolova & Lapalme, 2009). This matrix represents the relation between the actual and the predicted classes (Fig. 13.6a). Four main measures might be extracted from this matrix (Sokolova & Lapalme, 2009). The first measure, named accuracy, is the ratio of the number of examples correctly predicted (true positives and true negatives) to the total number of samples available. The second is named precision, which is the ratio of the number of true positives to the total number of examples predicted as positive (true and false positives). Sensitivity is the ratio of the number of true positives to the total number of positive examples (true positives and false negatives), while specificity is the ratio of the number of true negatives to the total number of negative samples (true negatives and false positives).
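The four measures follow directly from the counts in the confusion matrix, as the short sketch below shows. The function names and the toy labels are hypothetical, and returning `nan` for an empty denominator is a convention chosen for this example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Return nan instead of failing when a class is absent."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(tp, fp, tn, fn):
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Toy example: 4 patients (label 1) and 6 controls (label 0).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
counts = confusion_counts(actual, predicted)
metrics = binary_metrics(*counts)
```

In this toy case one patient is missed and one control is flagged, giving 3 true positives, 1 false positive, 5 true negatives, and 1 false negative.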

In general, an optimal model should present high sensitivity and specificity. However, real-world datasets tend to show an imbalance between these measures. To evaluate this aspect, the receiver operating characteristic curve (ROC curve) presents an illustrative plot of the discriminant ability of the binary classifier for different thresholds (Fawcett, 2006). This curve is plotted using the sensitivity of the classifier as the y-axis and the fall-out (i.e., 1-specificity) as the x-axis (Fig. 13.6b). The area under the ROC curve (AUC) describes the probability that the classifier will rank a random positive instance higher than a random negative example (Fawcett, 2006). In other words, when comparing the AUC of different classifiers, the higher the AUC, the better the classifier's average discriminative power.
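The ranking interpretation of the AUC can be computed directly by comparing every positive example with every negative example. The sketch below assumes that the classifier outputs a continuous score per example and counts ties as half a win; the function name and the toy scores are hypothetical.

```python
def auc_by_ranking(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive scores higher."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy scores: one positive example (0.4) is ranked below one negative (0.6).
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = auc_by_ranking(scores, labels)  # 8 of the 9 pairs are ordered correctly
```

A classifier that ranks every positive above every negative reaches an AUC of 1, while one that assigns the same score to all examples obtains 0.5.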

Finally, linear classifiers such as the LDA and the linear SVM present weights relative to each variable. These weights describe how relevant each variable is to identify each class (Sato et al., 2009). In addition to performance measures, this information adds valuable clues regarding the neural basis of the studied mental process. For example, it may indicate that specific frequencies in some brain areas are more related to one affective state than to another or that the volume of a subcortical structure might be a predictor of a given psychiatric disease.

Besides the evaluation methods listed in this chapter, other performance metrics might be used according to the characteristics of the ML algorithm and the experimental design. For a comparative review, please refer to Sokolova and Lapalme (2009).

#### **ML Applications in Social and Affective Neuroscience**

#### *Computer-Aided Diagnosis*

Psychiatric disorders are defined by the presence of a specific set of symptoms. However, some symptoms are shared across disorders, and a single patient might satisfy criteria for multiple disorders or not fit the requirements for any precise diagnosis (Huys et al., 2016). In this context, an increasingly popular application of ML in social and affective neuroscience is in the quest for imaging biomarkers of psychiatric disorders. This popularity is due to a recent focus on individualized medicine. Although classical statistical approaches provide biomarker descriptions at the group level, physicians have to make clinical decisions about individuals (Orru et al., 2012). Thus, ML has been an active area of research for the development of potential computer-aided individualized diagnosis methods.

From this perspective, the use of structural MRI data combined with ML approaches is presenting promising results for a better understanding of obsessive-compulsive disorder (OCD). For example, Soriano-Mas et al. (2007) successfully distinguished patients with OCD from healthy controls with more than 90% accuracy based on brain structural features. Also, these data were used to predict the severity of obsessive-compulsive symptoms (Hoexter et al., 2013), as well as to list potential biomarkers using dimensionality reduction approaches (Trambaiolli et al., 2017).

In depressive spectrum disorders, structural MRI also achieved accuracies around the 90% threshold when classifying patients and controls (Mwangi et al., 2012), while functional MRI successfully discriminated between bipolar and unipolar depression with similar performances (Grotegerd et al., 2013). Also, structural and functional variations in affective-related brain regions, such as the amygdala, the insula, and the cingulate cortex, predicted symptom severity and treatment response (Siegle et al., 2006; Chen et al., 2007). Similarly, ML predictive approaches efficiently predicted the treatment response of patients with anxiety disorder for both pharmacological (Whalen et al., 2008) and cognitive behavioral (Doehrmann et al., 2013) therapies. However, it is important to emphasize that such findings have not yet reached clinical significance and are not currently incorporated in psychiatric practice.

#### *Emotion/Affective Decoding*

Brain decoding is the identification of someone's mental states based exclusively on measurements of their brain activity (Haynes & Rees, 2006). This rests on the idea that different neural activity patterns are associated with different mental states. Thus, decoding these patterns might be fundamental for our understanding of the neural basis of human cognition (Haynes & Rees, 2006). In this context, the ability of ML methods to identify and learn from patterns makes them a suitable approach for affective brain decoding.

A spectral power asymmetry over the frontal regions during emotion elicitation is a classical effect reported from EEG data analysis (Balconi et al., 2015). Applying an ML approach, Wang et al. (2014) reached more than 80% predictive accuracy when distinguishing between positive and negative affective valences. Similar classification results were reported using fNIRS recordings over the prefrontal cortex when comparing positive or negative affective states with neutral states (Trambaiolli et al., 2018a). Also, prefrontal activity even during resting state seems to be related to emotional processing, since resting state frontal asymmetry predicts responsiveness to affective elicitation (Balconi et al., 2015).

However, human emotions involve complex networks comprising areas not accessed by the EEG or fNIRS spatial sampling and resolution. Using fMRI data, Baucom et al. (2012) achieved up to 90% accuracy in single-participant classification between positive and negative valences using voxels from the medial and the ventrolateral prefrontal cortex, anterior cingulate, and amygdala, among other regions. Later, Lindquist et al. (2016) developed a meta-analytic study compiling data from 397 functional studies and different ML methods to investigate different hypotheses of network organization during the elicitation of affective valence. Their evidence suggests a single network composed of areas such as the dorsomedial prefrontal cortex, ventrolateral prefrontal cortex, supplementary motor area, anterior insula, amygdala, ventral striatum, and thalamus, which respond to both positive and negative valence, but with different patterns of activation depending on the affective state (Lindquist et al., 2016).

#### *Neurofeedback*

Due to the recent success of ML in decoding different mental states, this approach was also used to develop therapeutic applications, such as neurofeedback. Neurofeedback is a real-time procedure in which feedback of the neural activity in specific neural substrates is provided to the volunteer, aiming to achieve the self-regulation of these areas or networks (Sitaram et al., 2017). Specifically, affective neurofeedback targets substrates related to emotional processing (Trambaiolli et al., 2018b) and might be useful as a nonpharmacological treatment for psychiatric symptoms or disorders, such as schizophrenia, major depressive disorder, attention-deficit/hyperactivity disorder, and obsessive-compulsive disorder (Fovet et al., 2015).

Different imaging methods allow different approaches to control affective networks. On the one hand, electrophysiological methods usually aim to control specific frequency bands in particular subsets of electrodes (Begemann et al., 2016; Enriquez-Geppert et al., 2017). For example, EEG alpha asymmetry in frontal electrodes was tested to reduce depressive symptoms, while central beta suppression and theta enhancement were applied to minimize inattention and impulsivity symptoms (Begemann et al., 2016).

On the other hand, hemodynamic methods use the upregulation or downregulation of the local blood flow in specific targets (Sulzer et al., 2013). For example, depressive patients who achieved self-control of the amygdala through fMRI-based neurofeedback showed reduced indices of anxiety and increased indices of happiness (Young et al., 2014), as well as a positive correlation between the symptom improvement and the reorganization of amygdala functional connectivity after the neurofeedback training (Young et al., 2018).

#### *Social Neuroscience*

Despite the indisputable importance of living in a structured society for human affective and cognitive processes, how the human brain works across simple to complex social contexts remains largely elusive (Babiloni & Astolfi, 2014).

In current social neuroscience, the possibility of simultaneously recording the brain activity of two or more interacting people (i.e., hyperscanning) and of conceptualizing the connectivity emerging from such interactions (i.e., hyperconnectivity) has gained momentum (Montague et al., 2002). In this context, ML algorithms could be applied to model some degree of causal relation in social interactions mediated by interactions between brain activities (Konvalinka & Roepstorff, 2012). Anders et al. (2011) used fMRI recordings to predict the level of neural activity in romantic partners while experiencing the same emotional feelings. For this, the model was trained using data from one partner and used to successfully estimate the functional brain activation pattern of the other partner.

Another appealing field of research using ML approaches is the investigation of the neural correlates of complex social preferences and behaviors, such as friendship or engagement with political ideologies. For example, Kanai et al. (2011) applied a classifier to differentiate between participants with self-declared conservative or liberal political ideologies. Using the gray matter volume of the anterior cingulate and the right amygdala as inputs, the classifier reached nearly 70% accuracy (Kanai et al., 2011). In another study, liberal and conservative participants were classified using functional MRI data with remarkable AUC values of more than 98% (Ahn et al., 2014).

#### **Future Perspectives and Ethical Aspects**

During the last decade, the neuroimaging community has been making a continuous effort to create structured and standardized publicly available datasets, covering a wide range of samples and experiments (Poldrack & Gorgolewski, 2014). This action is fundamental to the development of optimized models for computer-aided diagnosis, for example. With larger samples, population heterogeneity, and standardized protocols, new ML models will be less susceptible to the influence of outliers and noise and will present higher generalization power (Schnack & Kahn, 2016). The extensive information resulting from these datasets will allow the use of ML approaches to confirm or to explore new aspects regarding the neural basis of affect and social interactions.

A promising instrumental evolution is the development of portable imaging devices, such as wearable EEG and fNIRS systems (Piper et al., 2014; von Lühmann et al., 2017). This technology allows studies outside the laboratory environment, leading to the observation of how the social brain acts in real-life situations (Balardin et al., 2017). Although ML algorithms will need to be adapted to deal with new levels of physiological (e.g., movement-related artifacts) and environmental (e.g., diverse magnetic fields) noise, a new range of naturalistic responses will be available for analysis. Neurofeedback applications would also benefit from portable devices, with the possibility of location-independent training or the passive control of affect-driven software or equipment.

Another exciting prospect is the use of ML to develop new concepts of social interaction, such as the so-called collaborative brain-computer interfaces (BCI) (Wang & Jung, 2011). Following the idea of neurofeedback, in BCI, the user intends to control a computer exclusively based on their brain activity (Sitaram et al., 2017). Collaborative BCI, in turn, uses brain waves from multiple users to control one single machine, with task performance increasing with the number of participants (Wang & Jung, 2011). Still in the context of BCI, other social environments were created with the assistance of ML algorithms. For example, Rao et al. (2014) proposed the brain-to-brain interface in humans, where the EEG signals from one user were used to stimulate the brain of a second subject through transcranial magnetic stimulation (TMS). Later, this concept was expanded into the idea of a "brainnet," where the signals of some users (senders) were collaboratively merged to stimulate the brain of an independent participant (receiver) (Jiang et al., 2018).

The advance of ML applications in affective and social neuroscience also raises some ethical concerns. In clinical settings, for instance, the use of ML algorithms will only be possible after careful evaluation and when proper evidence for improvement in either diagnostic accuracy or treatment efficacy is in place. To date, no conclusion or clinical decision should be taken exclusively based on the ML output, and future applications will surely depend on the integration of ML procedures with expert knowledge (Fu & Costafreda, 2013). Decoding affective states is an essential tool for the understanding of the brain basis of the human mind, as well as for the development of therapeutic approaches such as neurofeedback. However, an essential ethical and legal aspect regarding brain decoding applications is ensuring privacy and preventing non-consented commercial use of data or decoding results (Haynes, 2011).

#### **Final Considerations**

In this chapter, we introduced concepts of brain imaging and ML methods. While describing learning and validation methods, dimensionality reduction and feature selection approaches, performance estimations, and currently popular classifiers, we purposefully focused on supervised methods. This choice was based on the fact that these are the best examples for an initial overview of the ML topic and the most popular approach in neuroimaging studies. We also described some applications of ML to social and affective neuroscience problems, from basic investigations to clinical and therapeutic applications. Promising prospects were also mentioned to introduce the reader to cutting-edge advances in this area. Finally, we highlighted some ethical aspects that should be carefully considered when developing applications of ML in social and affective neuroscience.

**Acknowledgments** JRS is grateful to Sao Paulo Research Foundation (FAPESP, Grants #2018/04654-9, #2018/21934-5 and #2021/05332-8).

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 14 fMRI and fNIRS Methods for Social Brain Studies: Hyperscanning Possibilities**

**Paulo Rodrigo Bazán and Edson Amaro Jr**

**Abstract** Recently, the "social brain" (i.e., how the brain works in social context and the mechanisms for our social behavior) has gained focus in the neuroscience literature, largely because recently developed techniques allow studying different aspects of human social cognition and its brain correlates. In this context, hyperscanning techniques (Montague et al., Neuroimage 16(4):1159–1164, 2002) open the horizon for human interaction studies, allowing for the evaluation of interbrain connectivity. These techniques represent methods for simultaneously recording signals from different brains when subjects are interacting. In this chapter, we will explore the potentials of functional magnetic resonance imaging (fMRI) and functional near-infrared spectroscopy (fNIRS), which are techniques based on the blood-oxygen-level-dependent (BOLD) signal. We will start with a brief explanation of the basic principles of the BOLD response and the mechanisms involved in fMRI and fNIRS measurements related to brain function. We will then discuss the foundation of the social brain, based on the first studies, with one subject per data acquisition, to allow for understanding the new possibilities that hyperscanning techniques offer. Finally, we will focus on the scientific literature reporting the contribution of fMRI and fNIRS hyperscanning to understanding the social brain.

**Keywords** fMRI · fNIRS · Social brain · Hyperscanning · Whole-brain coverage

P. R. Bazán (\*)

LIM-44, Departamento de Radiologia, Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo, Brazil

Hospital Israelita Albert Einstein, São Paulo, Brazil e-mail: paulo.bazan@usp.br; paulo.bazan@einstein.br

E. Amaro Jr

LIM-44, Departamento de Radiologia, Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo, Brazil

#### **Introduction**

Recently, the "social brain" (i.e., how the brain works in social context and the mechanisms for our social behavior) has gained focus in the neuroscience literature, largely because recently developed techniques allow studying different aspects of human social cognition and its brain correlates. In this context, hyperscanning techniques (Montague et al., 2002) open the horizon for human interaction studies, allowing for the evaluation of interbrain connectivity. These techniques represent methods for simultaneously recording signals from different brains when subjects are interacting. In this chapter, we will explore the potentials of functional magnetic resonance imaging (fMRI) and functional near-infrared spectroscopy (fNIRS), which are techniques based on the blood-oxygen-level-dependent (BOLD) signal. We will start with a brief explanation of the basic principles of the BOLD response and the mechanisms involved in fMRI and fNIRS measurements related to brain function. We will then discuss the foundation of the social brain, based on the first studies, with one subject per data acquisition, to allow for understanding the new possibilities that hyperscanning techniques offer. Finally, we will focus on the scientific literature reporting the contribution of fMRI and fNIRS hyperscanning to understanding the social brain.

#### **Hemodynamic Response and BOLD Signal**

The relationship between neuronal activity and hemodynamic response is the basis underlying the blood-oxygen-level-dependent (BOLD) signal. One of the first models based on biomechanical properties related to blood volume, blood flow, and oxygen consumption was proposed by Buxton et al. (1998): the balloon model. This construct is a useful starting point to address how hemodynamics is related to neural function. Although a local increase in metabolism related to neuronal activity increases the consumption of oxygen, the increase in oxygen supply is higher than required by the energy consumption needs. This is because the transport of O2 from the intravascular (hemoglobin-linked) to the intraneuronal space depends on passive mechanisms related to differences in pressure gradients. The increased blood volume and flow must also be taken into account in the dynamics of the process. This equilibrium evolves over time and generates a small initial decrease in the oxyhemoglobin/deoxyhemoglobin proportion, followed by a strong increase of this ratio. When the temporal dynamics are considered, and also based on observations, the BOLD response associated with neuronal activity is a slow response that reaches its peak around 6 s after a stimulus. Moreover, this neurovascular coupling is associated with several pathways, including neuronal release of vasoactive mediators (e.g., nitric oxide) and pathways related to calcium activity in astrocytes (for reviews, see Longden et al. (2016) and Filosa et al. (2016)). Thus, the BOLD response evaluated with fMRI and fNIRS is an indirect measure of neural activity.
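The slow time course of the BOLD response can be visualized with a simple curve. The sketch below is not the balloon model and does not reproduce the initial dip; it uses a single gamma-shaped function whose shape and scale parameters are assumptions chosen only so that the peak falls at 6 s, in line with the value mentioned above.

```python
import math


def gamma_hrf(t, shape=7.0, scale=1.0):
    """Gamma-shaped response at time t (seconds); peaks at (shape - 1) * scale."""
    if t <= 0:
        return 0.0
    # Gamma probability density, used here purely as a convenient smooth shape.
    return (t ** (shape - 1)) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)


# Sample the response every 100 ms for 30 s after a brief stimulus at t = 0.
times = [i / 10 for i in range(301)]
response = [gamma_hrf(t) for t in times]
peak_time = times[max(range(len(times)), key=lambda i: response[i])]
```

With these illustrative parameters the curve rises for several seconds, peaks at 6 s, and then decays slowly, which is why the temporal resolution of BOLD-based measures is limited by physiology rather than by acquisition speed.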

#### **How Functional Magnetic Resonance Imaging Detects BOLD**

The blood-oxygen-level-dependent (BOLD) signal was detected in MRI by Ogawa et al. (1990) and was first described in human brains in 1992 (Bandettini et al., 1992; Kwong et al., 1992; Ogawa et al., 1992). For a detailed history of the development of fMRI, we recommend the review article by Bandettini (2012), which celebrates 20 years of the technique. The detection of the BOLD signal was possible because hemoglobin has different magnetic properties when it is oxygenated, due to conformational changes related to the iron-oxygen binding site. While oxyhemoglobin is diamagnetic (low interaction with magnetic induction), deoxyhemoglobin is paramagnetic and generates a distortion of the magnetic field around it. A distortion in the magnetic field causes a faster decrease in the hydrogen nuclear resonance signal. Higher concentrations of oxyhemoglobin relative to deoxyhemoglobin allow a more stable local magnetic field and therefore more signal. Hence, fMRI is a technique that measures relative signals based on the proportion of oxyhemoglobin and deoxyhemoglobin. This highlights the importance of baseline control conditions for fMRI experiments, as it is a relative rather than an absolute measure. fMRI has whole-brain coverage with good spatial resolution (in the order of millimeters, reaching submillimetric resolution with ultrahigh field MR systems) but a relatively low temporal resolution (in the order of seconds, mainly due to the slow hemodynamic response, although MR systems are capable of acquiring data in the order of hundreds of milliseconds). The precision of MRI is related to gradients generated in the magnetic field, which alter the specific radio frequency absorbed by hydrogen nuclei. Therefore, it is highly sensitive to movement and requires participants to lie down inside the scanner.

During fMRI, volunteers enter the scanner bore, a tunnel large enough to host a human body. As such, it is not a natural environment and may induce claustrophobia. Image acquisition depends on a head coil to detect the resonance signals. Also, there are several restrictions or exclusion criteria for participating in fMRI experiments due to the intense magnetic field, such as pregnancy, pacemakers, magnetic prostheses, tattoos (depending on the pigment used), and other situations that might pose a risk to the participant.

#### **How Functional Near-Infrared Spectroscopy Detects BOLD**

The BOLD signal in the human brain was detected using fNIRS around the same period it was observed using fMRI (Hoshi & Tamura, 1993; Chance et al., 1993; Kato et al., 1993; Villringer et al., 1993). Similarly, commemorative reviews detail the history of fNIRS development (Ferrari & Quaresima, 2012; Scholkmann et al., 2014). As mentioned above, the binding of oxygen to hemoglobin causes a conformational change that alters the electromagnetic properties of the molecule. fNIRS depends on changes in light absorption at different near-infrared wavelengths related to the binding of oxygen to hemoglobin. Although oxyhemoglobin and deoxyhemoglobin show different absorption rates in several parts of the electromagnetic spectrum, near-infrared light is less absorbed by the skull and other tissues between the cortex and the scalp. In this way, fNIRS can be used to evaluate separately the cortical concentrations of oxyhemoglobin and deoxyhemoglobin. For these measures, fNIRS uses optodes (similar to electrodes but with optical properties) as sources of light and as detectors of the light that is scattered through the brain tissue. A combination of a source and a detector forms an fNIRS channel located between the two optodes (one source and four detectors could form four channels). Due to light scattering and absorption, fNIRS can only detect signal a few centimeters below the scalp, providing mainly cortical signal. It is also important to note that tissue transparency to light depends on age: the skull in babies is more transparent than in adults. The recommended distance between optodes for adults is between 2.5 and 4 cm. Optodes positioned closer together (e.g., 0.8 cm) can be used to detect and later filter out hemodynamic processes unrelated to local brain activity (Brigadoi & Cooper, 2015). Usually, these optodes are attached to a cap that follows the 10–20 coordinate system (and its variations) of electroencephalography (Jasper, 1958; Oostenveld & Praamstra, 2001).
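
The way source-detector pairs become channels can be sketched as follows. This is a toy illustration only: the optode names, the flat two-dimensional coordinates, and the 1 cm cutoff for short-separation channels are assumptions, while the 2.5 to 4 cm window and the 0.8 cm example come from the text above.

```python
import math

# Hypothetical optode coordinates on the scalp, in cm (flat 2D approximation).
sources = {"S1": (0.0, 0.0)}
detectors = {
    "D1": (3.0, 0.0),
    "D2": (-3.0, 0.0),
    "D3": (0.0, 3.0),
    "D4": (0.0, -3.0),
    "D5": (0.8, 0.0),   # short-separation detector
}

def build_channels(sources, detectors, min_cm=2.5, max_cm=4.0, short_cm=1.0):
    """Pair every source with every detector and classify each pair by separation."""
    long_channels, short_channels = [], []
    for s_name, s_pos in sources.items():
        for d_name, d_pos in detectors.items():
            distance = math.dist(s_pos, d_pos)
            # The channel is taken to lie midway between the two optodes.
            midpoint = tuple((a + b) / 2 for a, b in zip(s_pos, d_pos))
            channel = (s_name, d_name, distance, midpoint)
            if min_cm <= distance <= max_cm:
                long_channels.append(channel)      # sensitive to cortical signal
            elif distance <= short_cm:
                short_channels.append(channel)     # mostly superficial signal
    return long_channels, short_channels

long_channels, short_channels = build_channels(sources, detectors)
print(len(long_channels), "cortical channels,", len(short_channels), "short channel")
```

In this layout one source and four detectors at 3 cm give four cortical channels, and the fifth detector at 0.8 cm gives one short channel that could serve as a regressor for superficial hemodynamics.
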
As an advantage, fNIRS can be portable and is less affected by movement, making it more suitable for ecological and naturalistic studies. Also, since the volunteers can move to a certain degree, it is more suitable for studies with babies and young children. The temporal resolution of fNIRS systems depends on the number of sources used, since each source has to be turned on separately to avoid mixing signals from different regions in the detectors. Usually, the sampling rate ranges from about 4 Hz to 60 Hz, providing fNIRS with higher temporal resolution than fMRI. This is useful for the correction of cardiac artifacts, given that the higher sampling rate diminishes the aliasing artifacts that are present in fMRI data. On the other hand, fNIRS has lower spatial resolution (on the order of centimeters) and does not offer whole-brain coverage (it is restricted to cortical signal), and the number of optodes available defines the cortical coverage of the system.
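
The aliasing argument can be checked with a short calculation. In this sketch the heart rate (72 beats per minute), the fMRI rate (one volume every 2 s), and the fNIRS rate (10 Hz) are assumed values chosen for illustration.

```python
def aliased_frequency(signal_hz, sampling_hz):
    """Apparent frequency of a sinusoid after sampling, folded into 0..sampling_hz/2."""
    return abs(signal_hz - sampling_hz * round(signal_hz / sampling_hz))

cardiac_hz = 1.2        # assumed heart rate of 72 beats per minute
fmri_hz = 1 / 2.0       # assumed: one whole-brain volume every 2 s
fnirs_hz = 10.0         # assumed: a rate inside the 4 to 60 Hz range

fmri_alias = aliased_frequency(cardiac_hz, fmri_hz)
fnirs_alias = aliased_frequency(cardiac_hz, fnirs_hz)
print(f"fMRI: cardiac signal appears at about {fmri_alias:.2f} Hz")
print(f"fNIRS: cardiac signal appears at about {fnirs_alias:.2f} Hz")
```

Under these assumptions the cardiac oscillation is folded to roughly 0.2 Hz in the fMRI data, where it is harder to separate from slower fluctuations, whereas in the fNIRS data it stays at its true frequency and can be removed with a conventional filter.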

#### **Types of Experiment Design and Data Analysis for BOLD Studies**

In task-based designs, fMRI and fNIRS can be used to identify regions with BOLD signal variation related to task variation (from a baseline control condition to the task of interest), based on the BOLD response after a stimulus (event-related design) or due to a block of stimuli (block design). In these types of design, it is important to consider stimulus (or task) sequence, duration, number of repetitions, time between stimuli, and the hypothesis about which brain regions will be related to the task in order to define the design that will provide the most statistical power for a general linear model analysis (the most common analysis in this context). For a review of study design, we suggest Amaro Jr and Barker (2006).
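
A minimal sketch of a general linear model analysis for a block design is given below, assuming a single task regressor plus a constant. The response function, block lengths, repetition time, and simulated signal are all illustrative assumptions; real analyses typically add nuisance regressors and use dedicated neuroimaging software.

```python
import math

def hrf(t):
    """Illustrative double-gamma response (parameter values are assumptions)."""
    if t <= 0:
        return 0.0
    peak = t ** 6 * math.exp(-t) / math.gamma(7)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    return [
        sum(signal[i - k] * kernel[k] for k in range(min(i + 1, len(kernel))))
        for i in range(len(signal))
    ]

def fit_glm(y, x):
    """Least-squares fit of y = beta * x + intercept."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return beta, mean_y - beta * mean_x

tr = 2.0                                    # seconds per volume (assumed)
n_scans = 100
# Block design: 20 s of baseline alternating with 20 s of task.
boxcar = [1.0 if (i * tr) % 40 >= 20 else 0.0 for i in range(n_scans)]
kernel = [hrf(k * tr) for k in range(16)]   # response sampled over 0 to 30 s
regressor = convolve(boxcar, kernel)
scale = max(regressor)
regressor = [r / scale for r in regressor]  # normalize so the plateau is 1

# Simulated channel: task effect of 2.0 on a baseline of 10.0 plus a small oscillation.
signal = [2.0 * r + 10.0 + 0.05 * math.sin(0.7 * i) for i, r in enumerate(regressor)]
beta, intercept = fit_glm(signal, regressor)
print(f"estimated task effect: {beta:.2f}, baseline: {intercept:.2f}")
```

The estimated task effect is expressed relative to the baseline, which mirrors the point that BOLD measures are relative and depend on a control condition.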

Alternatively, there are designs in which the subject sustains a brain state either by continuously performing a specific task or by remaining in a resting state (with no specific task, only the instruction to remain awake and not focus on anything in particular). These designs are used in connectivity studies, which examine the relationship between signals from different brain regions in order to explore brain organization and communication between areas. For example, resting-state studies allowed the identification of intrinsic brain networks (Greicius et al., 2003; Fox et al., 2005; Damoiseaux et al., 2006). There are several connectivity measures, and they can be applied both to resting-state studies (Han et al., 2018) and to task-based studies, such as event-related and block designs (Friston, 2011). Some measures are data-driven, like independent component analysis, while others depend on hypothesis-driven models, as in the case of dynamic causal modeling. Even simpler calculations, such as a correlation index, can be used to study the organization of brain networks. Graph theory can be applied to these connectivity measures to evaluate the characteristics of these networks.
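
The simplest of these approaches, a correlation index followed by a graph description, can be sketched as follows. The four region names, the synthetic time series, and the 0.5 threshold are assumptions made purely for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Toy time series for four hypothetical regions forming two "networks".
samples = range(200)
slow = [math.sin(0.1 * i) for i in samples]
fast = [math.sin(0.9 * i) for i in samples]
regions = {
    "A": slow,
    "B": [s + 0.1 * f for s, f in zip(slow, fast)],   # mostly follows A
    "C": fast,
    "D": [f + 0.1 * s for s, f in zip(slow, fast)],   # mostly follows C
}

names = list(regions)
matrix = {(p, q): pearson(regions[p], regions[q]) for p in names for q in names}

# Graph description: keep an edge when the absolute correlation exceeds a threshold.
threshold = 0.5
edges = [
    (p, q)
    for i, p in enumerate(names)
    for q in names[i + 1:]
    if abs(matrix[(p, q)]) > threshold
]
degree = {name: sum(name in edge for edge in edges) for name in names}
print(edges, degree)
```

Here the thresholded matrix separates the regions into two pairs, and the node degree is one of the simplest graph measures that can be derived from it.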

#### *Hyperscanning Design and Data Analysis*

The term hyperscanning was first used by Montague et al. (2002) to refer to measuring brain signals from interacting humans. In their study, two synchronized fMRI scanners were used to measure brain activity while pairs of volunteers interacted in a competitive game. Since then, hyperscanning has been performed with several techniques, such as electroencephalography (EEG), magnetoencephalography (MEG), fMRI, and fNIRS (Babiloni & Astolfi, 2014; Zhdanov et al., 2015; Wang et al., 2018). Unlike acquiring data from subjects separately, hyperscanning makes it possible to relate brain activity across subjects while preserving trial-specific characteristics. In other words, having a pair of volunteers perform the same task twice while measuring one subject at a time would provide a way of comparing brain activities, but the data from each subject would have different trial-specific characteristics, such as performance score, event-specific strategy, or brain state during the trial; with hyperscanning, these are preserved, providing more information on brain signals during interaction. Hyperscanning can also be used to expand the concepts of connectivity analysis from within to between brains, revealing not only which areas have more signal during interaction but also how these areas from different brains coordinate their activity.

There are some important technical aspects to consider when doing hyperscanning:

• Synchrony between equipment. To take advantage of simultaneous recording, the recordings from the different volunteers must be precisely synchronized. For example, if there is asynchrony or a lag between recordings, correlation measures will be shifted; with causation methods, this could significantly alter the interpretation of leader and follower. It is also important to note that the sampling rate and the type of signal measured directly influence the required precision of synchrony. In fMRI, usually with a whole-brain sampling interval of 2 s (and around 50 ms per slice), a lag of 50 ms (one slice) might be acceptable, also taking into account the slow hemodynamic response; in an EEG setup with a 1000 Hz sampling rate measuring fast electrical changes, 50 ms would represent a lag of 50 data points and would not be acceptable. If the lag is constant, it can be corrected during analysis by shifting the time series appropriately. However, if the lag is variable and has no clear pattern, it probably cannot be corrected. Intranet- and Internet-based synchrony solutions were developed to allow for triggering MRI scanners in different buildings (Montague et al., 2002). There are also software solutions such as lab streaming layer (https://github.com/sccn/labstreaminglayer; Gramann et al., 2014; Ojeda et al., 2014), which allows for synchronizing fNIRS and other types of signals received by a computer. Another option would be to use only one system to acquire the data from two or more subjects (see Fig. 14.1 for hyperscanning options).


**Fig. 14.1** Hyperscanning options: (**a**) using one data acquisition device for each subject and using synchrony devices to assure synchronized data acquisition; (**b**) sharing the same data acquisition device for all the subjects and providing synchrony between data from all subjects but limiting distance between subjects and reducing the number of channels (fNIRS). For schematic representation, fNIRS was presented in the image, but similar considerations are valid for fMRI hyperscanning. *LSL* lab streaming layer

digms (Wang et al., 2015), are good examples of interaction paradigms. We will further explore brain mechanism in some of these types of interaction in the section called "Hyperscanning Studies of the Social Brain."

• Having a good control condition. Since one of the main goals in hyperscanning is to measure interbrain connectivity and relate it to a specific task or condition, a proper control condition is required. For example, two volunteers receiving the same stimuli or performing the same task (e.g., watching a movie), with the same temporal structure, would have a level of synchrony between their brains, even if these data were recorded one subject at a time (Hasson et al., 2004). As we will discuss below, some studies used this property to study the social brain, recording one participant at a time but repeating the social task, using recorded videos from one session in the other (Schippers et al., 2009; Lee et al., 2018). To filter out these signal similarities or to have an adequate baseline condition, the control task should have all the same cognitive elements, with the same intensity, as the main task, except for the social aspect on which the research is focusing.
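
The synchrony considerations listed above (how many samples a given lag represents, and how a constant lag can be estimated and removed by shifting the time series) can be sketched as follows. The random test signal, the 3-sample delay, and the search window are assumptions; the sketch only covers a constant lag, since a variable lag would not be recoverable this way.

```python
import math
import random

def lag_in_samples(lag_ms, sampling_hz):
    """Number of samples (possibly fractional) spanned by a lag given in milliseconds."""
    return lag_ms * sampling_hz / 1000.0

def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def estimate_lag(reference, other, max_lag):
    """Find the shift of `other` relative to `reference` with the highest correlation."""
    n = len(reference)
    best_lag, best_r = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = reference[:n - lag], other[lag:]
        else:
            a, b = reference[-lag:], other[:n + lag]
        r = pearson(a, b)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r

# A 50 ms lag: 50 samples for EEG at 1000 Hz, a small fraction of one fMRI volume.
eeg_lag = lag_in_samples(50, 1000)
fmri_lag = lag_in_samples(50, 0.5)

# Simulate a second device that records the same signal 3 samples late.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]
true_lag = 3
other = [0.0] * true_lag + reference[:-true_lag]

estimated_lag, r_at_lag = estimate_lag(reference, other, max_lag=10)
# Correct the constant lag by shifting and trimming both series to the overlap.
aligned_reference = reference[:len(reference) - estimated_lag]
aligned_other = other[estimated_lag:]
print(eeg_lag, fmri_lag, estimated_lag)
```

In a real recording the two signals would come from different brains, so the lag would normally be estimated from shared trigger or marker channels rather than from the brain signals themselves; otherwise a genuine leader-follower delay could be mistaken for an equipment lag.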

#### **The Social Brain: Empathy, Theory of Mind, and Mirror Neuron System**

This section will provide some key theories about the brain mechanisms relevant for social tasks. To this end, the following text will concentrate on the first approaches used to unveil the brain mechanisms related to social interactions and lay the groundwork for understanding the advantages that hyperscanning techniques offer for studying the social brain, which will be covered in the next section.

The idea of the social brain refers to the brain mechanisms related to social skills and to cognitive processes important for social interaction, such as motor coordination (joint action or imitation), affective empathy, and theory of mind (ToM, also called mentalizing). ToM refers to the capacity to understand the perspectives and beliefs of other people (and maybe animals) or, in other words, to the ability to attribute a mental state to others and to oneself as well. The first study to use the term "theory of mind" evaluated the possibility of ToM in chimpanzees (Premack & Woodruff, 1978). Later, using positron emission tomography, Fletcher et al. (1995) studied theory of mind in humans by comparing stories in which a character's action had a mental state explanation with stories based on simple physical causality, finding increased activity in the left medial frontal gyrus, anterior cingulate gyrus, and posterior cingulate gyrus. The same idea of stories, but with different modalities, was also used by Gallagher et al. (2000), who identified increased response in the medial prefrontal cortex and temporoparietal junction (TPJ). The TPJ is now one of the main regions associated with the social brain, and the medial prefrontal cortex is related to ToM (Frith & Frith, 2003). Based on these results, fNIRS studies usually focus on the TPJ and prefrontal regions in the context of ToM, since most fNIRS studies do not have enough optodes for whole-head coverage. In adults, it was proposed that even without specific instruction (in a video story task), the TPJ is engaged and related to spontaneous detection of others' beliefs (Hyde et al., 2015). Also, the emotional state of the participant seems to affect lateral prefrontal cortex activity during the ToM director task, in which the volunteer has to consider the perspective of another person (Himichi et al., 2015). However, evaluating cognitive empathy and affective empathy in children is very challenging. In these cases, the inherent characteristics of fNIRS enable specific designs for children. For instance, a study using cartoon stories and verbal stories in 4–8-year-old children showed involvement of medial orbitofrontal regions, as well as the dorsolateral prefrontal cortex, in these tasks (Brink et al., 2011). In addition, other authors observed a stronger relationship of the TPJ with belief detection than with desire intention in 6–10-year-old children (Bowman et al., 2015).

Another part of the social brain is identifying motor intentions and coordinating motor action (as in imitation). In an fMRI evaluation of the mechanisms of imitation, the inferior frontal gyrus and the superior parietal lobule were found to be related both to performing finger-tapping and to observing finger-tapping, and even more intense activity was found during imitation of movement (Iacoboni et al., 1999). These results were later associated with the mirror neuron system (MNS; Grèzes et al., 2003; Iacoboni et al., 2005; see Jeon & Lee, 2018 for a review), which is related to the internal representation of the motor intentions of others and is formed by the inferior frontal gyrus, inferior and superior parietal lobule, borders of the superior temporal sulcus, and premotor cortex. The mirror neuron system was also identified with fNIRS in a more naturalistic table-setting task, which involved observation and execution (Sun et al., 2018). The mirror neuron system is usually associated with mentalizing, as a simulation-based system (Gallese & Goldman, 1998; Frith & Frith, 2006; Mahy et al., 2014), but there is still debate about the specific function and limits of each system in each social context and about how these systems interact. Canessa et al. (2012) used fMRI to study human subjects observing pictures with a cooperative context (two persons carrying an object) and pictures with an affective context (two persons holding hands). These authors observed that both conditions seemed to engage the temporoparietal junction, but the ventromedial prefrontal cortex was more related to affective scenes, while the inferior frontal gyrus and inferior parietal lobule were more associated with cooperative images. Moreover, the signal in these regions was related to the empathy scores of participants. In a subsequent study, the effective connectivity of these regions was evaluated, indicating significant connectivity among them but with different directionality according to the type of picture (Arioli et al., 2018).

Receiving and processing social feedback is another important part of the social brain. In an fMRI study, the brain regions related to social influence on the rating of emotional images were evaluated, showing participation of the prefrontal cortex, borders of the superior temporal sulcus, amygdala, and insula (Lin et al., 2018). In this study, participants rated images before and after seeing the average rating given by (simulated) group members, to check the social influence and the induced rating adjustment. Another fMRI study evaluated positive and negative feedback on personal traits while simulating a competitive group hyperscanning situation, but the feedback was predefined by the experimenter, and only one subject was actually being evaluated at a time (Dalgleish et al., 2017). In these experiments, an increased BOLD signal was observed in the ventromedial prefrontal cortex related to positive feedback, while both positive and negative feedback elicited responses in the anterior cingulate cortex and amygdala.

The aforementioned studies had a similar design in the sense that they used single-subject tasks to study a specific cognitive function relevant for interaction (Fig. 14.2a). However, this method has limitations, as it does not measure brain signals during an actual interaction. As a first option to overcome this limitation, it is possible to evaluate the effect of an actual interaction on the brain activity of one subject (Fig. 14.2b). For example, Chauvigné et al. (2018) studied professional dancers in different types of hand interactions (a type of joint action), comparing brain activity during leading and following and finding activity in the inferior frontal gyrus and premotor cortex during leading and in the ventromedial prefrontal cortex and borders of the superior temporal sulcus during following. Rauchbauer et al. (2019) compared the brain activity of subjects during conversation with another human to the activity during conversation with robots and observed higher temporal activity in human-human interactions. These results may indicate a starting point for understanding human social cognition, as well as the social competence of robots interacting with humans.

Some studies also tried to compare the brain activity of participants who were scanned in different sessions, by recording a video of the first session and using it in the second session with the other volunteer. Lee et al. (2018) evaluated mother-child brain signal similarities during a stress condition for the adolescent (the video of the adolescent was recorded, and the mother was then scanned while observing this video of the adolescent during the stress condition), finding that the level of family relationship impacted similarity in the insula and anterior cingulate cortex. Schippers et al. (2009) also used this video recording technique and found mirror neuron system and TPJ activity during the decoding of gestures in a charades game.

Although the scientific evidence covered in the above paragraphs provided the basis for the main theories proposed to explain social interaction, it was based on experimental designs unable to probe the relationship between neural systems during dynamic interpersonal interaction. Probing this relationship is necessary to understand how interacting brains regulate their function based on the other person's behavioral responses.

#### **Hyperscanning Studies of the Social Brain**

Key aspects of the correlation between brain signals, and also of brain activity in real interactions, can be better evaluated in hyperscanning setups (Fig. 14.2, panel c). A review of hyperscanning with fNIRS reported 20 fMRI and 7 fNIRS hyperscanning studies published up to spring of 2013 (Scholkmann et al., 2013). Based on a PubMed and Web of Science search using the keywords Hyperscanning and fMRI and Hyperscanning and fNIRS, we found 14 fMRI hyperscanning research articles and 33 fNIRS hyperscanning articles from the beginning of 2014 to March of 2019, showing the increased application of these techniques, especially fNIRS hyperscanning. It is important to mention that our literature search found a total of 23 fMRI studies (9 up to spring of 2013) and 40 fNIRS studies, indicating that the search parameters used by Scholkmann were different from ours (perhaps not only due to time differences), especially regarding fMRI hyperscanning. The following subsections will explore hyperscanning experiments with each technique.

**Fig. 14.2** Types of social brain experiments and analysis possibilities. (**a**) Single subject brain signal acquisition; single subject task. (**b**) Single subject brain signal acquisition; multi-subject task. (**c**) Multi-subject brain signal acquisition; multi-subject task

#### *fMRI Hyperscanning*

The first study using the term hyperscanning was performed using two 1.5 T MRI scanners synchronized by one server through the Internet, with latencies below 300–400 ms, to show the feasibility of hyperscanning studies. After that, the technique was adopted to explore different aspects of social interaction using combinations of 1.5 T and 3 T scanners (Saito et al., 2010; Krill & Platek, 2012; Fliessbach et al., 2012; Tanabe et al., 2012; Spiegelhalder et al., 2014; Stolk et al., 2014), two or more 3 T scanners (Tomlin et al., 2013; Morita et al., 2014; Trees et al., 2014; Bilek et al., 2015, 2017; Koike et al., 2016, 2019; Shaw et al., 2018; Špiláková et al., 2019; Abe et al., 2019), and also a combination of 3 T and 7 T scanners (Baecke et al., 2015). Alternatively, Ray Lee and colleagues proposed performing hyperscanning with a dual-head coil designed for hyperscanning studies (Lee et al., 2010, 2012; Lee, 2015a, b). This would provide a good solution for synchrony issues and sequence parameters. As a drawback, an MRI scanner is already quite small for a single subject, so sharing this limited space with another person can be uncomfortable. Other examples of hyperscanning implementation include virtual reality (Trees et al., 2014) and a brain-computer interface system (Baecke et al., 2015).

Although there are several variations in fMRI hyperscanning techniques, one of the most frequently mentioned in the literature is eye-cued joint action. In this approach, one of the volunteers has to guide their gaze according to the gaze of the other participant. This simple task circumvents some difficulties faced by other interaction tasks in the fMRI environment, mostly related to restrictions of body movement. These studies found higher activity during eye gaze cued tasks in the inferior frontal gyrus, occipital cortex, anterior cingulate gyrus/medial prefrontal cortex, temporal cortex, and borders of the superior temporal sulcus (Saito et al., 2010; Tanabe et al., 2012). Moreover, higher interbrain cross correlation, after filtering out task effects, was found in the right inferior frontal gyrus (Saito et al., 2010), and this connectivity was diminished in pairs in which one of the volunteers had autism (Tanabe et al., 2012). Interbrain connectivity in the inferior frontal gyrus was also observed during simple mutual gaze, either days after a joint action task (Koike et al., 2016) or without joint attention task execution, but closer to the insular cortex in this case (Koike et al., 2019). With a different approach during eye-cued joint attention mutual gaze, using independent component analysis and evaluating the relation between the task-related components from each volunteer of the pair, higher interbrain connectivity was detected in the right temporoparietal junction (Bilek et al., 2015). The normal connectivity pattern was also disrupted in patients with borderline personality disorder (Bilek et al., 2017). These examples illustrate the potential of hyperscanning in the context of disorders that affect social interaction (Ray et al., 2017).
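
The idea of computing interbrain correlation after filtering out task effects can be sketched as follows. This is a toy illustration and not the procedure of Saito et al. (2010): the boxcar task regressor, the simulated signals, and the single-regressor model are assumptions. Both simulated brains respond to the same task, so their raw signals correlate strongly; once the task-related part is regressed out, only the task-independent coupling remains, which is close to zero in this example by construction.

```python
import math

def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def residualize(y, x):
    """Remove the least-squares fit of x (plus a constant) from y."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    beta = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    return [(yi - mean_y) - beta * (xi - mean_x) for xi, yi in zip(x, y)]

n = 200
# Shared task structure: alternating blocks of 20 samples.
task = [1.0 if (i % 40) >= 20 else 0.0 for i in range(n)]
# Each brain = shared task response + its own unrelated fluctuation.
brain_a = [3.0 * task[i] + math.sin(0.3 * i) for i in range(n)]
brain_b = [3.0 * task[i] + math.cos(0.3 * i) for i in range(n)]

raw_r = pearson(brain_a, brain_b)
residual_r = pearson(residualize(brain_a, task), residualize(brain_b, task))
print(f"raw: {raw_r:.2f}, after removing task effects: {residual_r:.2f}")
```

The contrast between the two values shows why a correlation driven only by the shared task structure should not be read as evidence of interaction-specific coupling.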

Several other social interaction paradigms have been evaluated with fMRI hyperscanning. A joint force task, used to study cooperation and motor coordination (Abe et al., 2019), revealed higher interbrain connectivity in the right temporoparietal junction, and the signal increase in the right temporoparietal junction during the task was related to performance scores. Spiegelhalder et al. (2014) and Stolk et al. (2014) evaluated verbal and signal communication, respectively, and found a similar result: synchrony of temporal lobe signals, either with areas related to talking (in verbal communication) or with the partner's temporal lobes (in signal communication).

Game theory tasks are also used in hyperscanning experiments. For instance, the ultimatum game is a turn-based game in which a proposer chooses how a reward will be split between participants and the responder then decides either to accept the proposal, in which case the reward is distributed as proposed, or to reject it, in which case none of the participants receives a reward. Using this type of paradigm, Fliessbach et al. (2012) found striatum and ventromedial prefrontal cortex activity in both participants for more generous proposals and for higher levels of acceptance by responders. Another interesting finding is the interbrain connectivity of the anterior/middle cingulate cortex, which was correlated with the reciprocity of the proposer, possibly related to the judgment of proposals in situations with advantageous and disadvantageous inequities (Shaw et al., 2018). An important mechanism involved in social decision-making is the reward mechanism, which seems to be associated with increased BOLD signal when the pair worked together to complete a task, as indicated in a study that used a maze task in which one of the participants saw the maze and gave instructions to the other, who had to drive through the maze without directly seeing it (Krill & Platek, 2012). In a group hyperscanning study with groups of five participants at a time, the effect of social influence (provided by feedback about other participants' decisions) on decisions indicated that the insular response was higher when individual decisions were different from those of other participants and also predicted the tendency to realign with the group in the next decision. This insular activity could be related to embarrassment, as suggested by a study that evaluated self-face recognition while being observed by others, which also detected intra-brain connectivity between the anterior cingulate cortex and the dorsomedial prefrontal cortex when being observed (Morita et al., 2014).

As previously mentioned, social interaction tasks can have different structures related to temporal structure of the task, goal of task, and dependency of actions. These can evoke different neural systems, as evaluated by Špiláková et al. (2019). They studied goal and temporal structure effects with a pattern game in which a builder player had to form a pattern with disks in a virtual game table, while other participants could cooperate or try to avoid reaching the pattern (goal effect). In one condition, the players responded simultaneously, while in the other the builder started the turn-based interaction. Higher BOLD signal was found during cooperative tasks in the ventromedial prefrontal cortex, superior and middle temporal gyri, and orbitofrontal cortex, while competitive tasks were related to the dorsolateral prefrontal cortex, supplementary motor area, insula, and cerebellum. Also, simultaneous tasks had higher activity in the temporal lobe, insula, and motor areas. These highlight the importance of choosing the appropriate paradigm according to the desired interaction mechanism to be studied.

Together, these studies exemplify the possibilities of fMRI hyperscanning, using its high spatial resolution to try to identify specific roles for areas within a system. On the other hand, fMRI imposes several restrictions on experiment design, which limit more naturalistic tasks.

#### *fNIRS Hyperscanning*

The first hyperscanning study using the fNIRS technique involved joint action (Funane et al., 2011) in a task in which participants have to synchronize their motor response (usually a button press) after a stimulus. This task and small variations of it are probably the most used paradigms in fNIRS hyperscanning studies because they offer a simple model of interaction. The control condition in these studies is usually a competitive condition, in which the volunteers compete for the faster response after a signal to perform the movement. Using this kind of paradigm, studies found increased prefrontal connectivity between brains associated with better performance (Funane et al., 2011) and similar results in the superior frontal cortex (Cui et al., 2012). The coherence between pairs in the right superior frontal cortex was affected by the level of intimacy when comparing lovers, friends, and stranger pairs, with lovers presenting higher coherence and better performance (Pan et al., 2017). Also, the gender of the pairs seems to affect interbrain connectivity and performance, with male-male pairs having better synchrony performance, although different studies reported opposing interbrain connectivity results (Cheng et al., 2015; Baker et al., 2016). Interestingly, controlling the feedback after trials and runs could modulate performance and interbrain connectivity, with better performance associated with higher coherence in the superior prefrontal cortex and dorsolateral prefrontal cortex (Cui et al., 2012; Balconi et al., 2018). This effect was also evaluated in mother-child dyads, again finding right dorsolateral prefrontal cortex interbrain connectivity related to joint action (Reindl et al., 2018), and different brain mechanisms might be associated with child gender (Miller et al., 2019). Simulated positive feedback could also alter connectivity and performance even in a competitive task (Balconi & Vanutelli, 2017).

A small variation of this task is joint finger-tapping, in which participants have to synchronize finger-tapping movements, with metronome synchronization as the control condition. Studies found increased interbrain connectivity in the right prefrontal cortex (Dai et al., 2018) and in the medial prefrontal cortex, when performing source-based analysis (Zhao et al., 2017). When an imitation component was added to the task, with a leader and a follower defined to determine the finger-tapping rhythm, higher Granger causality from leader to follower was detected in the left premotor cortex (Holper et al., 2012). Prefrontal cortex involvement, as in the studies mentioned above, was also observed with the joint n-back task, a different type of joint task in which pairs performed a dual n-back task together, with each participant in charge of one n-back (Dommer et al., 2012). These studies highlight the possibilities of joint action interaction and finger-tapping tasks, although they might be too closely associated with motor mechanisms, and therefore other types of tasks could be important to further explore the social brain.

A game theory paradigm with the ultimatum game was also evaluated with fNIRS hyperscanning, indicating higher coherence between right temporoparietal junctions when performing the task face-to-face (Tang et al., 2016), probably associated with higher ToM. This is in agreement with studies of eye contact by itself, which can alter interbrain coherence in temporal regions (Hirsch et al., 2017). Face-to-face interaction also has an impact on communication mechanisms, presenting higher hyperconnectivity (interbrain connectivity) in the left inferior frontal cortex than during back-to-back dialogues and even than during face-to-face monologues (Jiang et al., 2012). In a group communication, left temporoparietal junction hyperconnectivity distinguished a leader-follower pair from a pair of two followers. In a four-group word game, synchrony was observed in frontopolar regions (Nozawa et al., 2016). Several factors could explain the differences in the location of the synchrony: the studies used different fNIRS optode positioning, different communication tasks, and different artifact handling during analysis.

In turn-based game interactions, a poker game adaptation indicated the importance of the temporoparietal junction for ToM when comparing human-human against human-computer competitions (Piva et al., 2017). The TPJ was also related to higher-risk decisions, which are assumed to engage more mentalizing due to more careful evaluation of the opponent (Zhang et al., 2017). This study further suggested there might be a gender difference in high-risk situations, with women presenting higher TPJ hyperconnectivity than men. Pattern game studies found higher mentalizing during competition and therefore increased hyperconnectivity in the inferior parietal lobule, as well as different inferior frontal gyrus participation, while the borders of the superior temporal sulcus showed hyperconnectivity during both competitive and cooperative games (Liu et al., 2015, 2017). On the other hand, a comparison of obstructive and cooperative Jenga games indicated more hyperconnectivity in the right dorsolateral prefrontal cortex during cooperation (Liu et al., 2016). Therefore, competitive characteristics in some task-based games are completely different from competition control conditions in joint action or in the Jenga obstruction example, due to the different strategies and different mentalizing requirements in each type of competition. These results suggest that, rather than unified cooperation and competition mechanisms, different mirror neuron system, empathy, and ToM mechanisms might mediate cooperation and competition according to the specific task design.

Working together to solve problems is a common situation in social interaction, and it is related to creativity. Studies with the realistic presented problem task, in which participants have to provide as many solutions as possible to a realistic problem, showed higher hyperconnectivity in the dorsolateral prefrontal cortex and in the right temporoparietal junction associated with higher cooperation. This held when comparing with a competitive task, when evaluating the creativity levels of dyads, and in the context of an experimenter acting as a participant and providing feedback on the ideas proposed by participants (Xue et al., 2018; Lu et al., 2018, 2019; Lu & Hao, 2019). Interestingly, when comparing creativity levels between pairs, low-creativity individuals formed efficient dyads in cooperation, with high interpersonal neural synchrony and good performance in the task, while other pairs with at least one participant classified as creative did not show the same cooperation and connectivity. Also, positive feedback enhanced interaction and cooperation, and negative feedback seemed to disrupt interaction.

Considering other close to real-life tasks, teacher and student interactions were evaluated, showing higher interbrain connectivity in the left prefrontal cortex when information was transferred from teacher to student (Holper et al., 2013). Also, a feasibility study suggested that it is possible to perform the teacher-student experiment with a child student, finding the student's prefrontal signal related to the teacher's TPJ (Brockington et al., 2018). The same article explored the possibility of recording data from four students in a class and showed hyperconnectivity between students when they were paying attention to the teacher. Last, the authors showed that it is also possible to combine these measures with eye-tracking information. In another example of a real-life situation that can be studied with fNIRS hyperscanning, interbrain connectivity was evaluated between client and counselor during psychological counseling, indicating hyperconnectivity in the right TPJ compared to a chatting control situation, possibly related to the mentalizing required for psychological counseling (Zhang et al., 2018).

Music is also highly related to social interactions and therefore could be a good task for fNIRS studies. A song-learning task found increased hyperconnectivity in the inferior frontal cortex, with directionality from teacher to learner (Pan et al., 2018). Similar results were observed related to cooperative singing and humming, regardless of face-to-face or non-face-to-face execution of the task (Osaka et al., 2014, 2015). Also, the right inferior frontal cortex seemed to be more associated with humming. Pairs of violinists in a leader-follower context presented higher temporoparietal junction and somatomotor signal when playing as follower (Vanzella et al., 2019), and a feasibility study suggested hyperconnectivity between violinists (Balardin et al., 2017). Another feasibility study suggested that multibrain hyperscanning data could be represented as a multibrain network and evaluated as a graph and, as a demonstration of feasibility, proposed data collection from nine participants drumming simultaneously (Duan et al., 2015).

A different approach to fNIRS hyperscanning was proposed by Duan et al. (2013). In their study, the authors designed a neurofeedback platform. Neurofeedback has been studied as a tool for improvement of cognitive functions, especially in the context of brain disorders, though there is still debate about the efficacy and proper use of neurofeedback (Thibault et al., 2016; Kadosh & Staunton, 2019). Therefore, hyperscanning neurofeedback might present an opportunity for investigations on enhancement of social abilities or treatments of disorders which affect the social brain, but these possibilities should be addressed carefully.

These studies highlight the advantage of fNIRS for naturalistic and closer to real-life tasks, given its tolerance to movement and, in some systems, its portability. However, it is also important to note that some disagreement in the fNIRS hyperscanning literature can come from different optode positions, given that studies with systems which allow whole cortical coverage are scarce, and possibly from the lower spatial resolution of fNIRS, which implies studying larger cortical areas as a single region, compared to the specificity that fMRI studies can provide. Further, different preprocessing steps (mainly dealing with noise) could affect hyperconnectivity results, and the search for better analysis processes is ongoing.
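The sensitivity of hyperconnectivity estimates to preprocessing can be made concrete with a toy simulation. The sketch below is not taken from any of the cited studies: it uses Pearson correlation as a simple stand-in for the coupling measures used in hyperscanning work, and the simulated signals, the drift slope, and the helper names (`pearson`, `remove_linear_trend`) are illustrative assumptions. Two simulated channels share nothing except a slow linear drift, yet they look strongly coupled until the drift is removed.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def remove_linear_trend(x):
    """Subtract the least-squares straight line from a signal."""
    n = len(x)
    mean_t = (n - 1) / 2
    mean_x = sum(x) / n
    slope = sum((t - mean_t) * (v - mean_x) for t, v in enumerate(x)) / sum(
        (t - mean_t) ** 2 for t in range(n)
    )
    return [v - (mean_x + slope * (t - mean_t)) for t, v in enumerate(x)]


# Assumption: two "participants" with independent noise plus one shared drift,
# standing in for a slow artifact common to both recordings.
rng = random.Random(0)
n_samples = 600
drift = [0.01 * t for t in range(n_samples)]
signal_a = [rng.gauss(0, 1) + d for d in drift]
signal_b = [rng.gauss(0, 1) + d for d in drift]

r_raw = pearson(signal_a, signal_b)
r_detrended = pearson(remove_linear_trend(signal_a), remove_linear_trend(signal_b))

print(f"correlation before detrending: {r_raw:.2f}")
print(f"correlation after detrending:  {r_detrended:.2f}")
```

In this toy case the apparent coupling comes entirely from the shared drift, which is one way a noise-handling choice could change a hyperconnectivity result.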

#### **Conclusion and Future Perspectives**

We have discussed the possibilities fMRI and fNIRS offer for studying the social brain. While fMRI provides high-resolution whole-brain coverage, fNIRS offers a great opportunity for real-life task studies. Studies recording one subject can help elucidate basic mechanisms that are engaged and combined during real interaction, as they offer easier possibilities for isolating and controlling cognitive aspects of the experiment. Meanwhile, hyperscanning provides an integrated look that can unveil new interactions between basic mechanisms and possibly new mechanisms (or better models) for social interaction related to different contexts. Moreover, hyperscanning can offer a new perspective in the search for biomarkers and in understanding diseases and disorders that affect the social brain (Ray et al., 2017). Further studies should also focus on combining fMRI and fNIRS with techniques such as EEG, eye-tracking, and possibly even MEG, to provide more information and higher temporal resolution while preserving spatial resolution, although newer analysis methods and hypothesis-driven models are also required for better use of the data (Koike et al., 2015). Another potentially interesting combination for fNIRS and fMRI hyperscanning is with autonomic response measures, since there seems to be autonomic coupling during cooperation (Vanutelli et al., 2017). Controlling for these autonomic responses in hemodynamic response-based systems could help interpret the data (Kadosh & Staunton, 2019).

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 15 Modulating the Social and Affective Brain with Transcranial Stimulation Techniques**

**Gabriel Rego, Lucas Murrins Marques, Marília Lira da Silveira Coêlho, and Paulo Sérgio Boggio**

**Abstract** Transcranial brain stimulation (TBS) is a term that denotes different noninvasive techniques which aim to modulate brain cortical activity through an external source, usually an electric or magnetic one. Currently, there are several techniques categorized as TBS. However, two are most used in scientific research: transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS), which stimulate brain areas with a high-intensity magnetic field or a weak electric current on the scalp, respectively. They represent an enormous contribution to behavioral, cognitive, and social neuroscience since they reveal how delimited brain cortical areas contribute to some behavior or cognition. They have also been proposed as a feasible tool in the clinical setting since they can modulate abnormal cognition or behavior through brain activity modulation. This chapter will present the standard methods of transcranial stimulation, their contributions to social and affective neuroscience through a few main topics, and the studies that adopted those techniques, also summarizing their findings.

**Keywords** Transcranial magnetic stimulation · Transcranial direct current stimulation · TMS · tDCS

L. M. Marques Instituto de Medicina Fisica e Reabilitacao, Hospital das Clinicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, Brazil

G. Rego · M. L. da Silveira Coêlho · P. S. Boggio (\*)

Social and Cognitive Neuroscience Laboratory, Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil e-mail: boggio@mackenzie.br

#### **Introduction**

Transcranial brain stimulation (TBS) is a term that denotes different noninvasive techniques which aim to modulate brain cortical activity through an external source, usually an electric or magnetic one. Currently, there are several techniques categorized as TBS. However, two are most used in scientific research: transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS), which stimulate brain areas with a high-intensity magnetic field or a weak electric current on the scalp, respectively. They represent an enormous contribution to behavioral, cognitive, and social neuroscience since they reveal how delimited brain cortical areas contribute to some behavior or cognition. They have also been proposed as a feasible tool in the clinical setting since they can modulate abnormal cognition or behavior through brain activity modulation. This chapter will present the standard methods of transcranial stimulation, their contributions to social and affective neuroscience through a few main topics, and the studies that adopted those techniques, also summarizing their findings.

#### **Essentials of Transcranial Electrical and Magnetic Stimulation**

Transcranial magnetic stimulation (TMS) first appeared in 1985 and was initially adopted to investigate nervous propagation along the corticospinal tract and peripheral nerves (Rossini & Rossi, 2007) and to investigate the excitability of different brain areas (Hallett, 2007). It consists of a coil and one or two generators (also called stimulators), which generate current pulses that the coil converts into a magnetic field. When positioned on the scalp, the coil delivers a magnetic pulse that creates a transient electric field in the cortical areas underneath, activating neural networks through axonal depolarization or impairing neural activity through postexcitatory inhibition, i.e., the "silent period" (Chen et al., 1999; Lefaucheur et al., 2014). It is possible to apply single or paired (e.g., double or triple pulses) TMS to investigate intracortical circuits and their relation to behavior and cognition (Ni et al., 2011). In addition to these procedures, it is also possible to use repetitive TMS (rTMS) to excite or inhibit a cortical area depending on the parameters adopted, mainly the frequency of pulses delivered. Studies with the motor cortex established that low-frequency stimulation (≤1 Hz) is usually inhibitory while high-frequency stimulation (≥5 Hz) is excitatory, but a variation in these effects can occur due to differences in intensity and duration of rTMS. It is also important to highlight that differences in effect depend on other parameters, such as the type of coil, its distance and orientation relative to the head, and the waveform, intensity, and frequency of the magnetic pulse (Lefaucheur et al., 2014).
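The frequency rule of thumb for rTMS can be written down as a small lookup, which may help when tabulating protocols from different studies. This is only a sketch of the heuristic stated above for the motor cortex: the function name `expected_rtms_effect` and the label used for frequencies between 1 and 5 Hz are assumptions made for illustration, and the function deliberately ignores intensity, duration, coil type, and the other parameters that can change the outcome.

```python
def expected_rtms_effect(frequency_hz: float) -> str:
    """Motor cortex rule of thumb: <=1 Hz usually inhibitory, >=5 Hz excitatory.

    Hypothetical helper for illustration only; real effects also depend on
    intensity, duration, coil type, and other stimulation parameters.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    if frequency_hz <= 1.0:
        return "usually inhibitory"
    if frequency_hz >= 5.0:
        return "usually excitatory"
    # Frequencies between 1 and 5 Hz are not covered by the rule as stated.
    return "not covered by the rule of thumb"


for frequency in (1.0, 3.0, 10.0):
    print(f"{frequency} Hz -> {expected_rtms_effect(frequency)}")
```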

Another neuromodulation technique commonly adopted in neuroscience, similar to TMS, is transcranial direct current stimulation (tDCS). An initial version of tDCS (named the "medical battery") appeared during the nineteenth century to treat several ailments. Nevertheless, only at the beginning of the twenty-first century were its mechanisms deeply investigated and the technique broadly adopted as a research tool (Wexler, 2017). tDCS consists of a low-intensity direct current (about 1 to 3 mA) applied to the brain by positioning two or more electrodes on the scalp, forming an electrical circuit. While a minimum of two electrodes is needed to close the circuit, it is possible to find montages with more electrodes, as in high-definition tDCS. The electrodes vary in format and size, usually ranging between 10 and 40 cm² in a round or square format. As observed in studies investigating motor cortex excitability, the stimulation's typical effect is enhanced excitability in cortical areas below the anodal electrode and inhibition below the cathodal electrode. However, differences in these effects can occur in brain areas other than the motor cortex, like those related to higher cognitive processing, where a linear effect between intensity and cortical excitability or inhibition seems not to be the rule. Also, the effect can vary according to (i) the montage adopted, (ii) the size and orientation of the electrodes, (iii) the intensity and duration of stimulation, (iv) individual characteristics (e.g., gender, age, anatomical differences), and (v) whether tDCS is applied during an active state (performing an activity of interest) or in a resting state (Giordano et al., 2017; Sellaro et al., 2017).
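Because both the current and the electrode size vary between studies, it can be useful to compare protocols by the nominal current density at the electrode, that is, current divided by electrode area. The text above does not discuss current density, so treating it as a summary quantity is an assumption of this sketch, as is the function name `current_density_ma_per_cm2`. The example combines the extremes of the ranges quoted above (1 to 3 mA, 10 to 40 cm²), which need not correspond to any protocol actually in use.

```python
def current_density_ma_per_cm2(current_ma: float, electrode_area_cm2: float) -> float:
    """Nominal current density at the electrode, in mA/cm^2 (illustrative helper)."""
    if current_ma <= 0 or electrode_area_cm2 <= 0:
        raise ValueError("current and electrode area must be positive")
    return current_ma / electrode_area_cm2


# Extremes of the ranges quoted in the text, combined for illustration only.
lowest = current_density_ma_per_cm2(1.0, 40.0)   # weakest current, largest electrode
highest = current_density_ma_per_cm2(3.0, 10.0)  # strongest current, smallest electrode

print(f"lowest nominal density:  {lowest} mA/cm^2")   # 0.025
print(f"highest nominal density: {highest} mA/cm^2")  # 0.3
```

Under these assumptions the nominal density spans a twelvefold range, which is one simple reason why intensity and electrode size are listed among the factors that can change the effect of stimulation.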

Despite the similarities presented here, there are also clear differences between TMS and tDCS. For instance, TMS stimulation is more focal, since the cortical target is circumscribed to an area of about 2 or 3 cm² when using a coil (Lozano & Hallett, 2013). Conversely, the cortical target of the usual tDCS montage is broader, but it is possible to target a narrow area by employing high-definition tDCS. TMS stimulation is also more intense than tDCS, so it is possible to interrupt neural activity or elicit an action potential with TMS, while tDCS can only modulate ongoing activity. Nevertheless, since TMS has a higher intensity, there is also a risk of seizures not present in tDCS stimulation, although reports in the scientific literature indicate it is rare (less than 1 seizure per 60,000 sessions) if safety guidelines are adopted (Lerner et al., 2019). Another critical point for research and clinical practice is that tDCS is easier to apply than TMS because it has fewer parameters. Moreover, tDCS is considerably less expensive than TMS.

It is also essential to present some relevant limitations to both techniques. First, such techniques are more focused when modulating cortical areas of the brain but are not widely used to stimulate subcortical areas. The stimulation of subcortical areas is usually indirect, employing a tDCS current passing those areas (yet with limited focus) or in response to some cortical region's stimulation by tDCS or TMS, such as stimulation of frontal areas to modulate the activity of subcortical areas as in the case of emotion regulation. TMS also has some coil models (e.g., H-coil, halo coil, or double-cone coil) that allow deep stimulation but also with less focus when compared to cortical targets.

Finally, concerning some practical aspects of using TBS, both techniques are usually applied prior to ("offline") or concomitant with ("online") some cognitive or behavioral task. It is essential to consider safety aspects when using such techniques, such as avoiding their application in participants with epilepsy, metallic implants in the head, or pacemakers. Concerning tDCS, it is relevant to ascertain the integrity of the skin where the electrodes will be applied; besides, some participants report reactions of severe discomfort and skin irritation. Here, we present only a few of the more superficial aspects of both techniques. Bearing in mind that such techniques require different preparations and care, we recommend reading specific articles on practical aspects of applying and preparing experiments for tDCS (Woods et al., 2016) and TMS (Hannula & Ilmoniemi, 2017). In the following topics, we will address the use of both techniques in social and affective neuroscience, as well as their main findings.

#### **Social Neuroscience**

Social neuroscience is an interdisciplinary field that aims to understand the neurobiology of social cognition and behavior in humans and animals, first created from the merging of social psychology, neuroscience, and social sciences. It aims to investigate brain structures and their functioning in various social processes, such as communication, cooperation, empathy, moral judgment, prejudice, social learning, social decision-making, social perception, and so on (Cacioppo & Cacioppo, 2013; Lieberman, 2007). This section will present a few social neuroscience topics that adopted tDCS or TMS and demonstrate how these approaches clarified the brain processes related to prejudice, social decision-making, and moral judgment.

#### *Prejudice*

Prejudice is the attitude toward others based on their group membership, and it is intrinsically related to affective and cognitive processes, such as social categorization and stereotyping (Amodio, 2014). Negative beliefs about the outgroup influence choices, judgments, and behaviors (Sellaro et al., 2015) and can give rise to discrimination and prejudice toward outgroup members (Amodio, 2014). In contrast, individuals judge members of the same group more positively when compared to another racial group, a phenomenon called ingroup favoritism (Taylor & Doria, 1981).

Different cerebral cortical regions are involved in prejudice, mainly associated with social perception and evaluation (Gamond et al., 2017). One of the primary brain areas associated with prejudice is the medial prefrontal cortex (MPFC), an area involved in several cognitive activities, such as social perception, categorization, stereotyping, and regulation/control of behavioral responses in social contexts (Amodio & Frith, 2006; Amodio, 2014; Sellaro et al., 2015). Sellaro et al. (2015) investigated the causal role of MPFC in stereotype neutralization using tDCS. In this study, participants performed the implicit racial attitude task (racial IAT) while undergoing a tDCS protocol (anodal, cathodal, or sham) of 1 mA intensity targeting the MPFC. Anodal stimulation decreased implicit bias when compared to cathodal stimulation or sham. Sellaro's study was the first to demonstrate MPFC's causal role in cognitive control in overcoming negative judgment concerning another social group.

Another area associated with prejudice is the cerebellum. Recent studies have demonstrated the cerebellum's functional connectivity to the MPFC and other cortical regions like the temporoparietal junction during social judgments related to body reading, action sequencing, and mentalizing behavior (see Van Overwalle et al. 2015). One study by Gamond et al. (2017) evaluated the roles of the cerebellum and dorsomedial prefrontal cortex (dMPFC) in participants' implicit attitudes: Caucasian participants had to categorize the valence of positive/negative adjectives primed by ingroup or outgroup faces while receiving TMS. The behavioral experiment (without neuromodulation) showed ingroup bias, with faster categorization for positive adjectives primed by the ingroup faces. However, modulation of either the dMPFC or the right cerebellum interfered with this effect, preventing the ingroup bias. The results suggest that both brain areas play a causal role in social cognition processes, such as implicit social attitudes for ingroup members.

Finally, another study demonstrated that TMS over dMPFC interfered with one's ability to discriminate emotions expressed by ingroup members. These findings suggest a causal role of dMPFC in recognizing ingroup emotions (Gamond & Cattaneo, 2016). In summary, those studies have demonstrated, using tDCS and TMS, the crucial role of MPFC and related areas (e.g., cerebellum) in social cognitive processes such as social group categorization and recognition.

#### *Social Decision-Making*

Social decision-making is a social neuroscience topic that aims to comprehend the neural mechanisms of choosing between alternatives in a social context (Sanfey, 2007). tDCS and TMS have been adopted in social decision-making to investigate the causal role of brain areas (mainly prefrontal) during cooperation or competition situations simulated through simple games derived from behavioral economics. The selection of brain targets to modulate through tDCS or TMS is usually based on correlational studies previously conducted with neuroimaging techniques, pointing to the probable involvement of a cortical area in some aspect of social decision-making.

Two main areas investigated through tDCS and TMS in social decision-making are the dorsolateral prefrontal cortex (DLPFC) and medial prefrontal cortex (MPFC). Several tDCS and TMS studies targeted DLPFC in social decision-making through the ultimatum game (Knoch et al., 2006, 2008; Ruff et al., 2013), trust game (Knoch et al., 2009; Wang et al., 2016), and public goods game (Li et al., 2018; Liu et al., 2017). Overall, the findings point to the right DLPFC role in implementing controlled cognition to identify contextual social norms or expectations and orient adaptive behavior to comply with those norms (Sanfey et al., 2014). One study investigated the left DLPFC role in supporting people, showing that this area's enhanced excitability led to increased prosocial behavior. The authors hypothesized that this area could be related to the management of emotional information by controlled cognition (Balconi & Canavesio, 2014). Although these studies have clearly shown the involvement of DLPFC in social decision-making, the specific roles of the right and left areas are still not clear.

Another area investigated in social decision-making is the MPFC, usually detected in neuroimaging studies. One study investigated the role of MPFC in unfairness acceptance when unfair proposals were directed at oneself compared to a third party, showing that inhibition of this area led to a higher acceptance rate of unfair proposals and implying a causal role of MPFC in processing fairness in situations involving the self (Civai et al., 2014). In another study, Klucharev et al. (2011) evaluated the role of MPFC in social conformity in an attractiveness decision task, where participants rated the attractiveness of models presented in photos. In this task, the downregulation of MPFC diminished social conformity, indicating that this area is involved in social learning related to others' expectations in decision-making, as indicated by recent studies (Apps & Sallet, 2017; Sanfey et al., 2015). In summary, it appears that MPFC recruits controlled cognition to implement decisions related to oneself and calculate others' expectations in the context.

#### *Moral Judgment*

Moral judgment research studies judgments of right and wrong, mainly with respect to situations involving harm. Thus, most of the studies investigate moral judgment using dilemmas such as the trolley problem (Thomson, 1984), in which the participant should decide between preserving the individual rights of a single person and saving many others. This kind of task typically evaluates moral judgment along a utilitarian-deontological axis, considering cognitive reasoning relative to harm aversion. Furthermore, considering cortical brain regions, the majority of the studies investigated the modulation of two cortical structures, DLPFC and ventral MPFC (or just VMPFC), considering their role in other social phenomena (Boggio et al., 2016a; Darby & Pascual-Leone, 2017; Di Nuzzo et al., 2018).

Considering DLPFC modulation, one study by Tassy et al. (2011) investigated brain neuromodulation during moral judgment. The authors performed low-frequency rTMS (known to generate cortical inhibition) over the right DLPFC during a moral dilemma judgment task, where the participant should judge whether he or she considered an immoral attitude acceptable. The authors observed a significant increase in utilitarian judgments (i.e., "the most good for most people") during active TMS compared to sham. This finding points to the significant role of this structure in moral judgment, specifically in controlling emotional processes usually related to decreased utilitarian decisions. Similarly, Jeurissen et al. (2014) also demonstrated that low-frequency rTMS over the right DLPFC was associated with moral judgment modulation in personal dilemmas (leading to less utilitarian responses), but not in impersonal or nonmoral dilemmas. The authors explained that personal moral dilemmas are more emotionally salient; thus, the study suggested a role of the right DLPFC in cognitive control, probably dampening emotion processing and consequently enhancing utilitarian responses.

Regarding the use of tDCS, Kuehne et al. (2015) used a task very similar to that of Jeurissen et al. (2014), concerning dilemmas with personal involvement, but sought to modulate the contralateral homologous region, that is, the left DLPFC. For this purpose, they used three experimental conditions: two active conditions with the target electrode (anodal or cathodal) over the left DLPFC and the reference over the right parietal cortex, and one sham stimulation. The authors found that only the anodal condition presented significant moral judgment modulation compared to the sham condition, showing a decrease in utilitarian judgments (greater frequency of deontological judgments), thus highlighting the role of this left-hemisphere structure in the process of moral reasoning. However, in a recent study conducted by Zheng et al. (2018), an opposite effect was found, where the authors performed balanced bilateral tDCS over the DLPFC, with left anodal, left cathodal, and sham conditions. A significant decrease in utilitarian judgment was observed for dilemmas with personal involvement during anodal stimulation of the right hemisphere (cathodal over the left hemisphere), which is compatible with the results found by Jeurissen et al. (2014). Together, these findings reveal the need for experimental standardization, since the positioning of the reference electrode can significantly impact tDCS effects.

Considering the role of the VMPFC, many social neuroscience studies have shown this area's involvement in empathy processes (Shamay-Tsoory et al., 2003), theory of mind (Shamay-Tsoory et al., 2005), and moral judgment (Greene, 2007; Moll & de Oliveira-Souza, 2007).

The first work on neuromodulation of the ventral medial prefrontal cortex (VMPFC) and moral judgment, conducted by Fumagalli et al. (2010), investigated VMPFC modulation employing tDCS, with anodal and cathodal stimulation over this area or over the occipital cortex (control condition) and the reference electrode over the right deltoid. They performed a judgment task of moral dilemmas with personal involvement, without personal involvement, and nonmoral dilemmas. The authors observed that brain modulation was only effective in female participants (who already presented low levels of utilitarian judgments in comparison to the male participants at baseline), who presented a greater frequency of utilitarian responses after anodal tDCS on VPFC and a lower frequency after cathodal tDCS, compared to baseline trials (Fumagalli et al., 2010). It is worth noting that the authors did not find any significant effect for tDCS in the occipital cortex or sham condition. These findings indicate that neuromodulation of the VPFC may impact moral judgment and also highlight probable differences in brain circuitry for emotion processing between men and women (Fumagalli et al., 2010), as previously presented in the literature (Boggio et al., 2008). More recently, in a complementary way, Yuan et al. (2017) used a picture judgment task to assess moral judgment and arousal rating. Participants who received anodal tDCS on VMPFC (with the reference electrode on the right deltoid) significantly increased moral judgment and arousal rating compared to the sham condition. The authors did not evaluate sex differences that could complement the findings of Fumagalli et al. (2010). Finally, recent work by Riva et al. (2018) investigated VMPFC modulation during a moral dilemma task, with the active electrode (anodal or cathodal) over VMPFC and the reference electrode over the occipital area. The findings revealed similar effects to Fumagalli et al. (2010) and Yuan et al. (2017), where participants receiving anodal tDCS over VMPFC had a higher frequency of utilitarian judgments.

Overall, the findings regarding neuromodulation of the DLPFC and VMPFC highlight these structures' essential causal role in moral judgment processes. However, all these findings come from moral dilemma tasks, such as the trolley/train problem (Thomson, 1984), which only measure the participant's judgment along a deontological-utilitarian axis, without taking into account the different moral foundations (Graham et al., 2013).

#### **Affective Neuroscience**

Besides social phenomena, some studies have sought to understand several cortical brain structures' specifc role on affective phenomena, such as facial expression recognition and emotion regulation. It is a consensus that social and affective phenomena are closely intertwined (Boggio et al., 2016a, b), thus hindering the exclusive study of one of them. The following topics present the main fndings regarding neuromodulation to understand two of the main topics from affective neuroscience:

#### *Emotional Face Recognition*

One crucial use of neuromodulation has been to investigate brain networks involved in the recognition of emotional facial expressions. One of the main areas investigated is the medial prefrontal cortex (MPFC). Some of the studies applied low-frequency TMS to the dorsal MPFC (Balconi et al., 2011; Balconi & Bortolotti, 2012; Harmer et al., 2001), where inhibition of this area by neuromodulation specifically impaired recognition of facial expressions of anger and fear. TMS may have interfered with the dorsal anterior cingulate cortex's activity, usually responsive to negative valence emotions. Another possibility is related to the role of MPFC on other brain areas via top-down regulation, as indicated by one study coupling TMS and EEG where magnetic pulses delivered over the right MPFC led to altered early evoked electroencephalographic potentials detected at temporal and occipital regions (Mattavelli et al., 2013).

In addition to the MPFC, other studies also assessed orbital and dorsolateral prefrontal areas to investigate their role in processing facial expressions. For example, Nitsche et al. (2012) applied anodal and cathodal tDCS over the left DLPFC (reference electrode positioned on the contralateral supraorbital region). They found enhanced performance in healthy subjects on a facial expression identification task, markedly for positive valence emotions and anodal tDCS. In another study, Willis et al. (2015) applied anodal tDCS over the right orbitofrontal cortex with the reference over P3 (left parietal cortex). Compared to sham, active tDCS enhanced performance on facial expression recognition. It is essential to notice that implicating those specific prefrontal regions in emotion recognition is not so straightforward, since tDCS is not so focal and the reference electrode could also interfere with the results. For example, Heberlein et al. (2008) investigated patients with prefrontal lesions in diverse regions and found that only patients with ventromedial lesions had impaired facial expression recognition and emotional expression. Besides, tDCS over prefrontal regions could indirectly act on ventromedial regions, leading to confounded results about which region is related to emotion recognition. One way to solve this problem is to use TMS, which is more focal. In a study by Ferrari et al. (2017), TMS was applied over the right or left DLPFC, and both stimulations interfered with the recognition of facial expressions, irrespective of emotion, similar to what Nitsche et al. (2012) found with tDCS. Thus, it is possible to implicate DLPFC in emotion recognition.

Other frontal regions investigated through neuromodulation are the anterior portion of the supplementary motor area (pre-SMA) and the primary motor area (M1). Regarding the pre-SMA, Rochas et al. (2013) inhibited its activity through low-frequency TMS and investigated recognition of faces expressing happiness, anger, or fear. Disruption of the left pre-SMA impaired recognition of happy faces but did not affect recognition of fearful or angry faces. Taken together with Nitsche et al. (2012), this result suggests that the left hemisphere is implicated in processing positive valence, in line with previous neuroimaging and behavioral studies (Root et al., 2006). However, Ferrari et al. (2017) did not detect this, and the lateralized valence theory remains controversial in the literature (Root et al., 2006). Rochas et al. (2013) hypothesized that disrupting the pre-SMA impaired emotion recognition through the mirror neuron system (MNS), i.e., disrupting the motor simulation of an expression in motor areas could also impair emotion recognition, as proposed in simulation theories of emotion recognition (Gallese & Sinigaglia, 2011; Goldman & Sripada, 2005). Another study supported the role of the MNS in emotional face recognition by finding a positive correlation between M1 cortical excitability (assessed by TMS) in response to movement observation and performance in facial expression recognition (Enticott et al., 2008).

Another critical region investigated in facial expression recognition is the temporal lobe, given the vital role of the superior temporal sulcus in processing dynamic facial features, such as eye gaze and facial expressions (Furl et al., 2014). Three studies by the same group investigated the contribution of the right occipital face area (rOFA) compared to the right somatosensory cortex (rSC) (Pitcher et al., 2008) or the right posterior superior temporal sulcus (rpSTS) (Pitcher, 2014; Pitcher et al., 2014) in dynamic face processing. They found that all those areas contribute to recognizing facial expressions, with the rOFA responsible for early processing of facial features (less than 100 ms), while the rSC and rpSTS were responsible for later processing, although still within the automatic domain (between 100 and 170 ms).

Furthermore, although rOFA stimulation disrupted facial expression perception, this area appeared to be more related to the processing of static facial features, whereas rpSTS stimulation specifically disrupted dynamic face recognition. In sum, these results indicate a network of occipital and temporal areas responsible for processing dynamic features of facial expressions (Pitcher et al., 2008, 2014; Pitcher, 2014).

Another relevant study that modulated the temporal lobe is that of Boggio et al. (2008), which used tDCS and found opposite effects in women and men. The authors applied anodal tDCS over the left temporal region with the reference over the contralateral region. Stimulation enhanced women's performance in detecting sad faces, whereas men performed worse. This study indicates differences between men and women in how the brain processes and recognizes basic emotions, which is particularly problematic given that other studies have not evaluated gender as a factor in their analyses.

Finally, two other studies investigated emotional face recognition. Ferrucci et al. (2012) stimulated the cerebellum through anodal and cathodal tDCS, and both polarities led to better performance in recognizing faces expressing emotions of negative valence. In another experiment, Cecere et al. (2013) inhibited the left occipital region through cathodal tDCS while participants responded to a go/no-go task with images of fearful and happy faces. This experiment investigated the occipital cortex's role in integrating explicit and implicit stimuli (i.e., subliminal visual stimuli) shown to the left and right visual fields, respectively; it also investigated how unconscious emotional stimuli could facilitate behavior in a go/no-go task (correctly reacting to targets by pressing a button). The study demonstrated a facilitation effect in the go/no-go task when the explicit and implicit stimuli were congruent (showing the same expression of happiness or fear). However, after occipital cortex disruption by tDCS, this congruent facilitation disappeared, and implicit detection of fearful faces facilitated behavior, but only when the targets were happy faces (similar to patients with hemianopsia). The study thus distinguished the roles of cortical (occipital cortex) and subcortical routes in processing implicit visual information: the occipital cortex was involved in processing higher-order information regarding congruence, while subcortical routes were relevant for processing implicit fear stimuli.

In sum, neuromodulation studies indicate the existence of different systems for different basic emotions, as suggested by neuroimaging studies (Tettamanti et al., 2012; Diano et al., 2017), and these systems can vary between men and women. Modulation techniques also helped to elucidate the role of several brain areas (e.g., cerebellum, temporal, occipital, and frontal lobes) and of the MNS in emotion recognition.

#### **Emotion Regulation**

Emotion regulation is the capacity to modify one's own or someone else's emotional responses in order to intensify (upregulation) or diminish (downregulation) the current emotion (Gross, 2014). Many studies on this topic focused on the emotional reappraisal strategy, i.e., a technique to change the cognitive label of specific emotional content. This preference arises because the strategy is more effective in modulating the long-term emotional response (Gross, 2014) and is directly related to cognitive control and the brain structures involved in this control (Ochsner et al., 2012), mainly the dorsolateral prefrontal cortex (DLPFC) and ventrolateral prefrontal cortex (VLPFC), given the critical role of these structures in cognitive control, attentional orientation, response inhibition (Ochsner et al., 2012), and mediating the amygdala's activity (Wager et al., 2008).

One relevant study on this topic is by Feeser et al. (2014), who investigated the effect of anodal tDCS over the right DLPFC on the use of the emotion reappraisal strategy (cathodal electrode positioned at the contralateral supraorbital region). They found a significant increase in cognitive control as measured by arousal ratings and skin conductance response (SCR). The typical variation according to reappraisal, i.e., higher for upregulation and lower for downregulation compared to observation only, was potentiated by anodal stimulation of the DLPFC. These findings clarify the significant role of the right DLPFC in cognitive control and emotion regulation through reappraisal of negative valence content. Along the same lines, Pripfl and Lamm (2015) and Rêgo et al. (2015) also found a significant impact of right DLPFC anodal stimulation on cognitive control. However, contrary to Pripfl and Lamm (2015), Rêgo et al. (2015) also found that left anodal DLPFC stimulation significantly modulated emotion regulation, possibly due to increased attentional control, in line with Plewnia et al. (2015).

Thus, the studies appear to disagree regarding the effects of stimulating each hemisphere. With this in mind, Marques et al. (2018) investigated bilateral balanced DLPFC stimulation in two conditions compared to sham: (i) anodal left and cathodal right and (ii) anodal right and cathodal left. They did not find any significant impact of DLPFC tDCS on the emotional reappraisal of negative pictures. In a second study, however, they applied the same experimental procedures over the VLPFC and found that left anodal VLPFC tDCS significantly affected the reappraisal of negative pictures, increasing valence ratings (more positive) regardless of emotion regulation strategy. Furthermore, left anodal VLPFC tDCS had a significant impact on the cardiac inter-beat interval, increasing cardiac recruitment in the first seconds of emotional processing. This indicates that this stimulation condition significantly increased participants' cognitive engagement, which also led to an increased valence estimation.

Thus, following the discussion of Boggio et al. (2016b), these findings indicate that each of these brain structures plays a particular role in emotion regulation, such as the role of the DLPFC in cognitive control (Ochsner et al., 2012) and of the VLPFC in attentional control (Wager et al., 2008). Future studies should standardize experimental protocols, given the significant discrepancies in the literature related to electrode size, current intensity, cathode positioning, and emotion regulation tasks. Moreover, as highlighted by Kim et al. (2019), future studies should also use TMS as an exciting technique to address the roles of both the DLPFC and VLPFC in emotion regulation.

#### **Conclusions**

To conclude, the transcranial stimulation methods tDCS and TMS have been important tools for investigating the role of cortical circuits in several social processes, such as prejudice, social decision-making, and moral judgment, and affective processes, such as emotion recognition and regulation. These techniques were essential to demonstrate the role of several brain areas in a plethora of processes previously described in neuroimaging studies. TMS studies could also demonstrate the role of different areas in a brain network across time, which is very relevant for indicating how the brain integrates complex information among several cortical areas.

The observed cognitive and behavioral effects in response to brain modulation are of great relevance since they can indicate the future use of these neuromodulation techniques to modulate brain activity noninvasively in clinical patients with social or affective disorders to ameliorate their clinical condition.

#### **References**


Thomson, J. J. (1984). The trolley problem. *Yale LJ, 94*, 1395.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 16 What Our Eyes Can Tell Us About Our Social and Affective Brain?**

**Paulo Guirro Laurence, Katerina Lukasova, Marcus Vinicius C. Alves, and Elizeu Coutinho de Macedo**

> *The only true voyage of discovery, (… would be) to possess other eyes, to behold the universe through the eyes of another, of a hundred others, to behold the hundred universes that each of them beholds, that each of them is. Marcel Proust, Remembrance of Things Past (or In Search of Lost Time)*

**Abstract** The eyes are windows to the soul. This popular saying expresses the idea that it is possible to deeply understand people's minds just by observing how their eyes behave. This assumption is not that far from reality. By analyzing the eyes of subjects, researchers have answered questions about how people think, remember, pay attention, and recognize each other, among many other theoretical and empirical questions. Recently, with the advancement of research in social and affective neuroscience, researchers have started to look at human interactions and how individuals' eyes relate to their behaviors and cognitive functions in social contexts. To measure individuals' gaze, researchers use equipment specialized in recording eye movements and changes in pupil diameter: a device known as an eye tracker.

**Keywords** Eye tracking · Pupillometry · Cognitive ethology

P. G. Laurence · E. C. de Macedo (\*)

Social and Cognitive Neuroscience Laboratory and Developmental Disorders Program, Center for Health and Biological Sciences, Mackenzie Presbyterian University, São Paulo, Brazil

K. Lukasova

Postgraduate Program in Neuroscience and Cognition – PPGNC, Federal University of ABC – UFABC, São Bernardo, Brazil

M. V. C. Alves

Faculty of Health Sciences of Trairi, Universidade Federal do Rio Grande do Norte, Santa Cruz, Brazil

#### **Introduction**

The eyes are windows to the soul. This popular saying expresses the idea that it is possible to deeply understand people's minds just by observing how their eyes behave. This assumption is not that far from reality. By analyzing the eyes of subjects, researchers have answered questions about how people think, remember, pay attention, and recognize each other, among many other theoretical and empirical questions. Recently, with the advancement of research in social and affective neuroscience, researchers have started to look at human interactions and how individuals' eyes relate to their behaviors and cognitive functions in social contexts. To measure individuals' gaze, researchers use equipment specialized in recording eye movements and changes in pupil diameter: a device known as an eye tracker.

Eye tracking as a research tool is more accessible than ever, and since it allows different inferences about mental functioning at a lower cost to research grants, its popularity has grown rapidly in psychology and cognitive neuroscience laboratories. The eye tracker is a nonintrusive device that normally emits infrared or near-infrared light to create a reflection on the cornea of the subject. This corneal reflection corresponds to the first Purkinje image (P1), obtained from the reflection of eye structures, and is commonly known as a "glint." The glint and the center of the pupil are used together to track eye movements: a camera in the eye-tracking device captures the corneal reflection, and the software calculates the vector between the corneal reflection and the pupil center, from which it estimates the gaze direction. For the software to be fully capable of capturing eye movements, a calibration procedure is required. It consists of presenting dots on the screen, which the subject should normally follow while the device registers the position of the eye (with the reflection) in order to calculate references for where the person is looking (Hansen & Ji, 2010).
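The calibration idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular eye tracker: it assumes the pupil center and the glint have already been located in camera pixels, and it fits an independent linear mapping per axis, whereas real systems typically use richer models. All function names and coordinates are hypothetical.

```python
def pupil_glint_vector(pupil_center, glint_center):
    """Vector from the corneal reflection (glint) to the pupil center, in camera pixels."""
    return (pupil_center[0] - glint_center[0], pupil_center[1] - glint_center[1])


def fit_axis(vector_values, screen_values):
    """Least-squares fit of screen = gain * vector + offset for one axis."""
    n = len(vector_values)
    mean_v = sum(vector_values) / n
    mean_s = sum(screen_values) / n
    variance = sum((v - mean_v) ** 2 for v in vector_values)
    covariance = sum((v - mean_v) * (s - mean_s)
                     for v, s in zip(vector_values, screen_values))
    gain = covariance / variance
    return gain, mean_s - gain * mean_v


def calibrate(vectors, targets):
    """Return a function mapping a pupil-glint vector to estimated screen coordinates.

    vectors: pupil-glint vectors recorded while the subject looked at each dot.
    targets: known screen positions (pixels) of the calibration dots.
    """
    gain_x, offset_x = fit_axis([v[0] for v in vectors], [t[0] for t in targets])
    gain_y, offset_y = fit_axis([v[1] for v in vectors], [t[1] for t in targets])

    def estimate_gaze(vector):
        return (gain_x * vector[0] + offset_x, gain_y * vector[1] + offset_y)

    return estimate_gaze


# Hypothetical three-dot calibration (made-up numbers for illustration only).
calibration_vectors = [(-10.0, -5.0), (0.0, 0.0), (10.0, 5.0)]
calibration_targets = [(460.0, 240.0), (960.0, 540.0), (1460.0, 840.0)]
estimate_gaze = calibrate(calibration_vectors, calibration_targets)
gaze = estimate_gaze(pupil_glint_vector((104.0, 52.0), (100.0, 50.0)))
```

Once fitted, `estimate_gaze` can be applied to every new pupil-glint vector recorded during the task; the quality of the estimate depends on how well the calibration dots cover the screen.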

The eye-tracking equipment records several useful measures. Eye-tracking measurements can be divided into four large groups: (1) movement measures (how the eyes move through space and the properties of these movements), (2) position measures (where a participant has or has not been looking and the properties of these positions), (3) numerosity measures (the proportion or rate of any countable eye movement event), and (4) latency measures (how long these events take to start and finish). Thus, depending on the research question, several measures are available. For example, if the study shows different emotional faces to the participant, it is possible to verify how much and in which places the participant's eyes move around to process those faces, the time it takes to complete these scanpaths, the parts of the face that the participant looks at, the number of times they check essential points of the faces (e.g., mouth, eyes), and even the time they spend processing any of these points. Each of these indices can answer different questions, and they may also be integrated to make inferences about underlying cognitive processes.

In relation to movements, eye-tracking devices can record saccades and fixations. Saccades are the movements that the eyes make when they are searching for stimuli in the environment, while fixations are brief moments when the eyes stop to look at something more carefully. During fixations, visual resolution is optimal and the visual system receives information from the retinal input; this is the moment when we process information and plan the next saccade to the objects of interest. Strictly speaking, the eyes are always in movement: even during a fixation, the eyes perform very small jitters. For classification purposes, however, eye movements are divided into those two categories (Liversedge et al., 2011). Another measure recorded by eye-tracking devices is pupil dilation, which is calculated from pupil diameter changes during task execution (Sirois & Brisson, 2014).
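One common way to separate the two categories in recorded data is a velocity threshold: intervals in which the eye moves faster than a cutoff are labeled saccades and the rest fixations. The sketch below assumes gaze samples already expressed in degrees of visual angle and uses a cutoff of 30 degrees per second purely as an illustrative default. The function names and sample values are hypothetical, and this is not necessarily the classifier used in the studies cited here.

```python
import math
from itertools import groupby


def classify_intervals(samples, velocity_threshold=30.0):
    """Label each interval between consecutive samples as 'saccade' or 'fixation'.

    samples: list of (time_s, x_deg, y_deg) gaze samples, ordered in time.
    velocity_threshold: cutoff in degrees per second (illustrative default).
    """
    labels = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        velocity = math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        labels.append("saccade" if velocity > velocity_threshold else "fixation")
    return labels


def count_events(labels, event):
    """Count runs of consecutive identical labels that match `event`."""
    return sum(1 for label, _ in groupby(labels) if label == event)


# Made-up samples at 100 Hz: a fixation, one fast jump, then another fixation.
samples = [
    (0.00, 0.0, 0.0),
    (0.01, 0.1, 0.0),
    (0.02, 0.1, 0.1),
    (0.03, 5.0, 0.1),
    (0.04, 5.1, 0.1),
]
labels = classify_intervals(samples)
n_fixations = count_events(labels, "fixation")
n_saccades = count_events(labels, "saccade")
```

Counting runs rather than individual intervals is what turns sample-level labels into the numerosity measures mentioned earlier, such as the number of fixations or saccades per trial.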

Fixations are a great way to study emotion recognition based on facial expressions. For example, when a person is visually scanning a human face in order to recognize an emotion, they fixate approximately 88% of the time on facial regions including the eyes, nasion, nose, or upper lip (see Fig. 16.1). Emotions such as fear, anger, sadness, and shame draw fixations predominantly to the region of the eyes, while other emotions, such as joy and disgust, draw more attention toward the upper lip. This fixation pattern is related to optimizing the visual search for cues that are important for emotion identification. For example, the deformation of the upper lip that is characteristic of a smile is an important feature of joy. Moreover, the lower part of the nose seems to be a key region for differentiating between emotional faces. Those results suggest that there are certain diagnostic regions in the face for emotion processing (Schurgin et al., 2014).

**Fig. 16.1** Regions of interest in the face for emotion recognition. (This image and the regions of interest were based on the manuscript of Schurgin et al. (2014))
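A position measure of this kind can be computed by assigning each fixation to a region of interest and summing fixation durations per region. The following is a minimal sketch under stated assumptions: regions are axis-aligned rectangles in image pixels, and the region names, coordinates, and fixation values are invented for illustration rather than taken from Schurgin et al. (2014).

```python
def dwell_proportions(fixations, regions):
    """Share of total fixation time spent in each rectangular region of interest.

    fixations: list of (x, y, duration_ms).
    regions: dict mapping a region name to (x_min, y_min, x_max, y_max).
    Fixations outside every region count toward the total only.
    """
    total = sum(duration for _, _, duration in fixations)
    dwell = {name: 0.0 for name in regions}
    for x, y, duration in fixations:
        for name, (x_min, y_min, x_max, y_max) in regions.items():
            if x_min <= x <= x_max and y_min <= y <= y_max:
                dwell[name] += duration
                break  # assign each fixation to at most one region
    return {name: time / total for name, time in dwell.items()}


# Hypothetical regions and fixations (pixel coordinates, durations in ms).
regions = {"eyes": (0, 0, 100, 40), "upper_lip": (30, 80, 70, 100)}
fixations = [(50, 20, 300), (50, 90, 100), (200, 200, 100)]
proportions = dwell_proportions(fixations, regions)
```

Comparing these proportions across emotion conditions is one simple way to ask whether, for instance, the eyes attract relatively more viewing time for fearful faces than for happy ones.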

Besides the emotion expressed by the face, another important aspect is face recognition per se. One important finding is that people are generally better at recognizing faces of their own race, a phenomenon called own-race bias (ORB). When trying to recognize faces of their own race rather than faces of another race, participants had a shorter response time. Eye-tracking studies have helped to understand this phenomenon. In a study in which Caucasian participants tried to remember whether they had seen Caucasian or Asian faces, a more complex scanning pattern emerged when participants looked at own-race faces: they performed more saccades and more fixations, and these fixations were shorter than for other-race faces. Saccade distance did not differ between own-race and other-race faces (Wu et al., 2012).

When trying to recognize a face, a person looks at the eyes, nose, and mouth more than 70% of the time. When the face is an own-race face, participants spend more time looking at the region of the eyes and forehead and less time looking at the nose. This gaze pattern points to a different strategy of visual processing when trying to recognize faces of one's own race compared to other races. The visual scanning of own-race faces is a more automatic, quick, and effortless process than the scanning of other-race faces (Alves & Bueno, 2017; Wu et al., 2012).

Since the effective evaluation of facial expressions and a correct inference of affective states are important for social interaction, studies have looked at the strategies used for face processing. They showed that when people look at static faces, they tend to direct their gaze to the right side of the face, the so-called left (hemispace) gaze bias, and this preferential looking is already present in children (Gilbert & Bakan, 1973; Sackeim et al., 1978; Heller & Levy, 1981; Hsiao & Cottrell, 2008; Chiang et al., 2000; Taylor et al., 2012). Balas and Moulson (2011) recorded the eye gaze of children 5–10 years old while they looked at a probe face and a target face and judged their similarity. They confirmed a left-side bias in 5-year-old children and showed that the left-side preference increased with age, but only for human faces. No effect was found when children looked at and judged monkey faces. Together with other studies, these findings indicate that, over the developmental trajectory, people improve their looking strategies as they acquire expertise in human face judgment. Indeed, looking at the left hemiface may be more informative. Several studies examined composite photographs of human and chimeric faces and asked whether left-left composites were more informative than right-right hemiface composites. In most cases, the left-left photographs were judged as more emotionally expressive (Moreno et al., 1990) and more trustworthy (Okubo et al., 2013) and had more muscle movements (Dimberg & Peterson, 2000). Nicholls and colleagues (2002) also found the left-gaze bias in faces turned slightly to the side (15°), which raised the question of whether the same eye movement strategies are found for faces viewed from different angles and in natural dynamic settings.

#### **Pupillometry**

Pupillometry is the measurement of variation in pupil diameter (i.e., pupil dilation) over time. In some of the earliest studies of pupil diameter, scientists took pictures of the participants while they performed tasks and then compared them with a baseline, that is, a period when no task was performed (Hess & Polt, 1960, 1964; Kahneman & Beatty, 1966). Since then, with the development of video-based eye trackers, scientific interest in pupillometry has been growing.

Pupil diameter changes to regulate the amount of light that enters the eye and reaches the retina, dilating to improve vision in dim light conditions. However, pupil diameter also increases in response to cognitive processing, such as performing a test or contemplating a photograph with strong emotional content. Numerous studies that used pupillometry as a complementary measure during cognitive tasks demonstrated that the magnitude of change is directly related to the tasks' cognitive demands. The change in pupil diameter related to the use of cognitive resources is minimal compared to the change caused by luminosity: while the former tends to vary by less than 1 millimeter, the latter may involve changes of up to 8 millimeters. This small but conspicuous difference is used to infer how participants allocate mental resources to perform demanding tasks. It is well known that change in pupil diameter is an effective indicator of a person's mental activity (Hess & Polt, 1964; Kahneman & Peavler, 1969). Pupillometric studies provide evidence that pupil dilation is related not only to processing emotional states but also to the increasing mental effort invested in a task (Eckstein et al., 2016; Hess & Polt, 1964; Kahneman & Peavler, 1969; Mathôt, 2018; Wierda et al., 2012).

Two other indices relevant to the mental effort hypothesis are the pupil dilation peak and the eye blink rate. The peak of dilation, arguably as reliable as pupil dilation, can be related to the peak of effort during a task, since the dilation can stabilize after the task begins (Beatty & Lucero-Wagoner, 2000; Hershaw & Ettenhofer, 2018). Eye blink rate is a complementary measure that can reflect cognitive engagement, usually before a high-demand task begins (Siegle et al., 2008; Van Bochove et al., 2013). Accordingly, more blinks represent more preparation for a hard task and, together with pupil dilation, can be used to indicate an effortful task (Fukuda et al., 2005; Ichikawa & Ohira, 2004).
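The indices discussed in this section can be derived from a pupil trace in a few steps: subtract a pre-task baseline, take the maximum of the corrected trace as the peak, and count blinks as gaps in the signal. The sketch below assumes diameters in millimeters, task onset at time zero, and blinks recorded as missing samples (`None`). These conventions, the function names, and the numbers are illustrative assumptions, not the pipeline of the cited studies.

```python
from itertools import groupby
from statistics import mean


def baseline_correct(trace, task_onset=0.0):
    """Subtract the mean pre-task diameter from every valid task-period sample.

    trace: list of (time_s, diameter_mm or None); None marks a blink.
    """
    baseline = mean(d for t, d in trace if t < task_onset and d is not None)
    return [(t, d - baseline) for t, d in trace
            if t >= task_onset and d is not None]


def peak_dilation(corrected):
    """Return (time_s, dilation_mm) of the largest baseline-corrected dilation."""
    return max(corrected, key=lambda sample: sample[1])


def blink_rate_per_minute(trace):
    """Blinks per minute, counting each run of missing samples as one blink."""
    blinks = sum(1 for is_missing, _ in groupby(trace, key=lambda s: s[1] is None)
                 if is_missing)
    duration_s = trace[-1][0] - trace[0][0]
    return 60.0 * blinks / duration_s


# Made-up trace: 0.2 s of baseline, then a task period containing one blink.
trace = [(-0.2, 3.0), (-0.1, 3.2), (0.0, 3.1), (0.5, 3.4),
         (1.0, 3.6), (1.5, None), (2.0, 3.3)]
corrected = baseline_correct(trace)
peak_time, peak_value = peak_dilation(corrected)
rate = blink_rate_per_minute(trace)
```

Because task-related changes are small relative to luminance-driven ones, a sketch like this is only meaningful when screen brightness is held roughly constant between the baseline and the task period.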

#### **Cognitive Ethology: From the Real World to the Lab and from the Lab to Virtual Reality**

A prominent research approach to eye tracking is cognitive ethology, which mostly studies everyday attention and social interactions (Kingstone, 2009; Smilek et al., 2006). The goal is to begin one's research at the level of natural performance before moving into the lab, where that performance can be recreated, controlled, and manipulated. Cognitive ethology thus offers an alternative way of studying attentional processes related to social interactions. By starting at the real-world level, the main focus is on what people really do in real life, and hence one can determine which behaviors are, and are not, specific to the laboratory environment (Kingstone, 2009).

People have strong tendencies to follow gaze cues. With the help of an eye-tracking device, MacDonald and Tatler (2013) investigated whether social perceptions of a collaborator affect how people look at them and follow their gaze. Namely, they aimed to understand how social context can affect gaze behavior during social interaction. In an experiment in which two participants worked together to perform a task (in their case, cooking), they found that social context influenced the way the participants interacted with their eyes, with each participant focusing attention depending on the actions of the other. This result highlights the usefulness of eye tracking in social research and of conducting experiments in naturalistic environments to show how social attention works in natural social contexts.

Besides eye tracking, another technology that may help investigate social neuroscience is virtual reality (VR). VR is interesting to social neuroscience because it allows the creation of ecologically valid experiments that can be fully interactive and three-dimensional (Parsons et al., 2017). One study recreated the trolley dilemma in VR. This is a well-known series of experiments on moral decision-making about whether to sacrifice one person to save a larger number of people by taking a certain action, such as diverting the incoming trolley onto a sidetrack. In the VR version of the task, participants had to choose between killing ten victims or one victim. The authors created three conditions for the experimental environment: the first with randomized women and men as possible victims, the second with possible victims of different ethnicities, and the third with one possible victim facing toward the participant and another facing away. Eye-tracking results showed that participants spent more time gazing at the chosen victim, which was unexpected, since they had been expected to avoid looking at the victim (Skulmowski et al., 2014).

In this experiment, the variation in pupil diameter was also examined. In all conditions, pupil diameter increased after the moment of decision, indicating that participants had an increased cognitive load at the moment of decision. In the different-ethnicity condition, participants showed greater pupil dilation, suggesting a higher cognitive load in an extreme social decision situation due to the controversial topic (Skulmowski et al., 2014). In the previously described study on faces of the same or a different race, pupil diameter was also recorded, indicating that a person shows greater pupil variation when trying to recognize other-race faces. This is consistent with the gaze pattern described above, indicating that recognizing a face of another race requires more cognitive effort, while recognizing own-race faces is more automatic.

#### **Eye Tracking in Clinical Populations**

Another interesting use of eye tracking is in studies of clinical populations with impaired social interaction, such as individuals diagnosed with schizophrenia, autism, or social anxiety disorder. Since eye tracking can reveal a person's underlying cognitive patterns, it can be a good tool for understanding how different clinical populations understand and process different stimuli. For example, eye tracking can help us understand which part of a stimulus (e.g., a face) a person fixates on most, indicating where the person directs most attention. Thus, it is possible to infer different cognitive processes in a clinical population by comparing their eye gaze with that of a typical population.

In relation to schizophrenia, there is a large number of eye-tracking studies, and a full account goes beyond the scope of this work. The findings point to different aspects of eye movement impairment in persons with schizophrenia, one of them being impaired smooth pursuit. Smooth pursuit happens when the eyes follow a moving object. In persons with schizophrenia, smooth pursuit lags behind the moving object, and thus a series of saccades is made to catch up with the target (O'Driscoll & Callahan, 2008). It has long been known that this population performs worse in anti-saccade tasks (Fukushima et al., 1988), during which the participant must avoid looking at a suddenly appearing target and instead look in the opposite direction. More recently, experiments revealed worse performance in a fixation task (Benson et al., 2012) in which participants were asked to visually fixate on a point while ignoring a cue appearing in the periphery. Furthermore, in free-viewing tasks, participants with schizophrenia tend to focus their gaze on a smaller area than typical participants (Sprenger et al., 2013).

Since people with schizophrenia present a different eye movement pattern compared with typical persons, there has been some discussion regarding the use of eye gaze as a biomarker for schizophrenia. This is possible because eye movements are underpinned by different neurological mechanisms that can be altered in persons with schizophrenia. In this regard, Morita et al. (2020) reviewed the findings in this area. The results suggest that eye movements can be used to discriminate between persons with schizophrenia and typical subjects at a rate of ~75–90% (Morita et al., 2020).

The eye movements of persons with autism spectrum disorder (ASD) also seem to differ from those of typical persons. The diagnostic facial regions described above may not be identified by persons with some developmental disorders. One meta-analysis reviewed studies on face processing and showed that children with ASD make significantly fewer fixations in the region of the eyes. Diminished attention to the eyes negatively impacts social interaction because not looking at social cues may lead to worse interaction and emotion recognition (Papagiannopoulou et al., 2014). The same meta-analysis found no significant differences in mouth region fixations between children with and without ASD. Another meta-analysis of 38 studies revealed that individuals with ASD show reduced social attention compared with typical individuals and that social attention in persons with ASD is influenced by social content (Chita-Tegmark, 2016). A comparison of gaze patterns in different regions of interest, this time in a meta-analysis involving 122 studies, found differences of small and medium magnitude (Frazier et al., 2017). In particular, participants with ASD had greater difficulty in selecting socially relevant or nonrelevant stimuli. The biggest differences were again found in the eyes and whole-face regions of interest (Frazier et al., 2017).

Very promising results come from studies on social attention in toddlers (18–35 months old) with and without ASD. Specific signs of ASD may be indicated by subtle variation in the way the child follows another person's gaze to a target of interest, an ability called joint attention. There are two principal kinds of joint attention: *response joint attention*, which requires the child to notice a change in the other person's gaze and follow it to the new target, and *initiation joint attention*, which requires the child to look at a moving object and indicate this fact to another person with her/his own gaze. While at 24 months of age the eye-tracking pattern, especially in initiating joint attention, differed between toddlers with ASD and typically developing children, this difference had disappeared 6 months later. With natural maturation, the toddlers with ASD improved their ability to disengage from face stimuli and explore the global aspects of the scene, bringing their eye movement pattern closer to that of typically developing children (Muratori et al., 2019).

Lastly, persons with social anxiety disorder (SAD) also present peculiarities in their eye gaze. A meta-analysis containing 13 studies demonstrated that participants with SAD presented a hypervigilance-avoidance effect in their eye gaze when looking at faces, compared to typical participants (Claudino et al., 2019). This eye gaze effect can be understood as a large number of fixations on the face at first, followed by fewer fixations on the stimulus later. Claudino et al. (2019) also found that this effect was more prominent for faces presenting negative emotions, such as anger.

#### **Conclusion**

The measurement of eye movement and pupil dilation is a valid undertaking for studies in cognitive, social, and affective neurosciences. With this technique, it is possible to carry out an ecological evaluation, which is cheaper and answers several important experimental questions. Using typically developing or clinical populations of different age groups and even allowing constant social interactions during the experiment, the device allows a series of inferences on cognitive processing based on objective, simple, and noninvasive physiological measures. The use of eye tracking by different behavioral disciplines depends more on the limit of what the researcher is willing to investigate than on the technique per se. To sum up, the possibilities of research questions that can be answered by participants' eyes go much further than expected or, rather, go beyond what the eyes can see.


## **Chapter 17 Facial EMG – Investigating the Interplay of Facial Muscles and Emotions**

**Tanja S. H. Wingenbach**

**Abstract** This chapter provides information about facial electromyography (EMG) as a method of investigating emotions and affect, including examples of application and methods for analysis. This chapter begins with a short introduction to emotion theory followed by an operationalisation of facial emotional expressions as an underlying requirement for their study using facial EMG. This chapter ends by providing practical information on the use of facial EMG.

**Keywords** Electromyography · Facial EMG · Facial emotional expressions · Facial muscles

#### **Introduction**

This chapter provides information about facial electromyography (EMG) as a method of investigating emotions and affect, including examples of application and methods for analysis. This chapter begins with a short introduction to emotion theory followed by an operationalisation of facial emotional expressions as an underlying requirement for their study using facial EMG. This chapter ends by providing practical information on the use of facial EMG.

#### **Theory: From Emotional States to Their Expression**

Darwin (1872/1965) studied emotions and their expression across species and argued that emotion phenomena were the products of natural selection. According to this evolutionary perspective, emotions constitute an interrelated suite of physiological and behavioural systems that have guided adaptive action over evolutionary time. According to Tomkins (1962), specific response patterns related to emotion experience are elicited automatically by certain events. For example, a threat or danger in the perceived environment should elicit fear. Emotion responses are characterised by coordinated patterns of activity that can include physiological changes, signalling behaviours in the voice and face, subjective experience, and relevant action. For example, a fearful response includes changes in brain activity in the amygdala (a region in the brain associated with emotion processing in general but specifically with fear) (Janak & Tye, 2015). Changes in physiology during a fearful episode can manifest as an associated facial expression (i.e. wide opened eyes, eyebrows pulled upwards and drawn together, and the corners of the mouth pulled outwards), the face turning pale, sweating, and a vocal expression (e.g. fear scream). Fear can direct our attention to the dangerous situation and facilitate adaptive action, such as fleeing. Emotions allow us to navigate life's challenges, and each emotion is governed by its own adaptive logic.

T. S. H. Wingenbach (\*)

School of Human Sciences, Faculty of Education, Health, and Human Sciences, University of Greenwich, Greenwich, London, UK e-mail: tanja.wingenbach@bath.edu

Several theories consider emotions as distinct entities and as biologically innate (e.g. Ekman et al., 1982; Izard, 1977; Plutchik, 1980; Tomkins, 1984). A very prominent theory is the 'basic emotion theory' (Ekman, 1992a, b), according to which some emotions are considered universal, meaning they occur in humans across all cultures. Most theorists agree on at least six basic emotion categories: anger, disgust, fear, sadness, surprise, and happiness (Ortony & Turner, 1990). According to Tomkins (1962), each emotion has its unique affect programme such as the example outlined above in the case of fear. Ultimately, these categories map onto distinct patterns of activity shaped by evolutionary processes to solve different kinds of adaptive problems faced by our highly social hominin ancestors. Research has provided evidence for distinct patterns in physiology on the basis of heart rate, temperature, and electrodermal activity for the six basic emotions, and these varying physiological patterns can be linked to functions of emotions on a behavioural level (as proposed by Darwin). In a state of anger, a preparation for fighting occurs by increasing the blood flow to the hands (Levenson et al., 1990). In a state of fear, the blood flow to large skeletal muscles increases which prepares for a flight reaction (Levenson et al., 1990). A state of disgust will lead to a rejection of the eliciting stimulus by restricting airflow to olfactory receptors and triggering a gag reflex (Koerner & Antony, 2010). A state of sadness results in a loss of muscle tone (Oberman et al., 2007), slowing us down, allowing us to focus on the issue that induced the sadness (Wolpert, 2008). A state of happiness leads to an increase in the energy available to the organism by releasing respective transmitters (Uvnäs-Moberg, 1998). A surprised state results in air being quickly inhaled which increases the ability to react fast (Ekman & Friesen, 1975), as it interrupts ongoing processes (Tomkins, 1962).
Even participants' subjective understanding (i.e. conceptualisation) of emotion reflects distinct patterns for each of the six basic emotions. When asking participants to colour in the body parts they perceive to be affected by either an increase or decrease in sensations when being in a state of each of the six basic emotions, the obtained results were in line with associated physiological changes as outlined above (see Nummenmaa et al., 2014). Neuroscientific research has shown that the distinctiveness of emotions is also evident in brain activity patterns. Vytal and Hamann (2010) conducted a neuroimaging meta-analysis and found distinct patterns of neural correlates for anger, disgust, fear, happiness, and sadness. The evidence presented here supports the assumption that there are distinct response patterns of emotions at least for the basic emotions.

One alternative view is that emotions can be characterised as the integration of at least two fundamental dimensions: *valence* and *arousal* (Russell, 1980). Russell (1994) views the dimensions of valence and arousal as universal to emotions but questions the universality of distinct emotion categories. The valence dimension spans from negative (i.e. unpleasant) to positive (i.e. pleasant). The arousal dimension ranges from low (i.e. deactivated) to high (i.e. activated). Any affective state can be represented as a combination of these two dimensions. Multidimensional scaling thus reveals similarities and dissimilarities between affective states. For example, sadness is an emotion considered as negative in valence and low in arousal, whereas anger is considered also as negative in valence but high in arousal. As such, the dimensional conceptualisation of affect and the categorisation of emotions are not mutually exclusive and can actually complement each other (see Harmon-Jones et al., 2017). However, it should be noted that not all affective states are emotions, while emotions always are affective states. For example, the longer-lasting affective states are called 'moods', and emotions are rather short-lasting, while other affective states overlap with cognitive states, e.g. confusion and boredom.
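
The dimensional view can be illustrated with a small data structure. The following is a minimal sketch, not part of Russell's model itself: the class name `AffectiveState`, the function `quadrant`, the -1 to +1 scaling, and the coordinates given for sadness and anger are all assumptions chosen only to match the qualitative description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectiveState:
    """A point in a two-dimensional valence-arousal space (hypothetical scaling)."""
    name: str
    valence: float  # -1.0 (unpleasant) to +1.0 (pleasant)
    arousal: float  # -1.0 (deactivated) to +1.0 (activated)


def quadrant(state: AffectiveState) -> str:
    """Describe a state by the sign of its valence and arousal coordinates."""
    valence_label = "positive" if state.valence >= 0 else "negative"
    arousal_label = "high" if state.arousal >= 0 else "low"
    return f"{valence_label} valence, {arousal_label} arousal"


# Illustrative coordinates only; they are not empirical values.
sadness = AffectiveState("sadness", valence=-0.7, arousal=-0.5)
anger = AffectiveState("anger", valence=-0.7, arousal=0.8)

print(quadrant(sadness))  # negative valence, low arousal
print(quadrant(anger))    # negative valence, high arousal
```

In this representation, sadness and anger share a valence coordinate and differ only in arousal, which is how the two frameworks can complement each other.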

#### **Facial Emotional Expressions**

The changes occurring throughout the body in an emotional state such as a face turning pale in a state of fear are visible to an observer and provide information about the affective state. Moreover, some physiological changes during the experience of emotion result in movement. For example, the activation of facial muscles leads to facial movement manifesting as facial expressions. Unlike skeletal muscles in the human body that are generally attached to bones, facial muscles also attach to each other or to the skin of the face. This anatomical set-up allows even slight contractions of facial muscles to pull the facial skin and create a facial expression visible to others. The general number of facial muscles in humans is 43, although this number can vary between people (Waller et al., 2008). This large concentration of muscles in a narrowly defined space (i.e. the face) allows for the execution of many different facial movements and results in various expressions. The *Facial Action Coding System* (*FACS*; Ekman & Friesen, 1978; new edition: Ekman et al., 2002) is an anatomical catalogue describing all movement-related facial actions (i.e. action units (AUs)) possible in humans. As a result, *FACS* has become a widely used tool in facial emotion research.

For emotional facial expressions to send an interpretable signal and serve as a means of communication, the emotion needs to be expressed in a certain way for it to be clearly attributable to a specific emotion. Ekman et al. (2002) provided suggestions for AU combinations that align with basic emotional expressions. For example, the activations of AU 9 (nose wrinkle), AU 10 (upper lip raise), and AU 25 (lips parted) together result in a facial expression displaying disgust. Since facial actions as outlined by the AUs are the result of facial muscle activations, facial muscles can be linked to specific AUs. Sticking with the example of disgust, the activation of the levator labii muscle leads to a wrinkling of the nose and a raised upper lip. The connection between facial action and muscles also provides the association with specific emotions. Table 17.1 shows the six basic emotions with associated AUs and facial muscles. The facial expressions resulting from AU activations per emotion category are considered prototypical and align with the universality assumption of basic emotions as proposed by Ekman (Ekman & Friesen, 1971). When participants are shown images (or videos) displaying these prototypes, attributions of the respective emotion label are generally high. Most facial emotion recognition research utilises prototypes of basic facial emotional expressions, and many stimulus sets including these prototypes have been developed for these purposes (e.g. Ekman & Friesen, 1976; Krumhuber et al., 2013; Matsumoto & Ekman, 1988; Tottenham et al., 2009; Van Der Schalk et al., 2011; Wingenbach et al., 2016; Young et al., 2002).

**Table 17.1** Basic emotions with associated AUs and facial muscles

As mentioned above, there are inter-individual differences in humans regarding their number of facial muscles. This variability raises the question of how prototypical displays of facial emotion are possible. Would it not require a standard set of facial muscles to produce expressions specific to basic emotions (as presented in Table 17.1)? To address this question, Waller et al. (2008) investigated whether the facial muscles underlying facial movements associated with facial emotional expressions of basic emotions are affected by inter-individual variability. These researchers dissected recent human cadavers and documented whether specific facial muscles were absent or present and whether this was the case for both sides of the face. The facial muscles investigated were the frontalis, orbicularis oculi, zygomaticus major, depressor anguli oris, orbicularis oris, procerus, corrugator supercilii, zygomaticus minor, buccinator, mentalis, depressor labii inferioris, risorius, levator labii superioris, levator labii superioris alaeque nasi, nasalis, and depressor septi. The first five facial muscles of this list were considered essential for the production of facial emotional expressions associated with the expression of basic emotions by Waller et al. (2008). Their results showed that the facial muscles assumed to be necessary to produce basic facial emotional expressions were present, mostly bilaterally, in all of the dissected cadavers. In addition, muscles commonly associated with the expression of basic emotions (as outlined in Table 17.1) were, although not always bilaterally, present in all cadavers, i.e. the corrugator, mentalis, depressor labii inferioris, and both levator labii muscles. The other facial muscles investigated were not present in all cadavers, and many were only present unilaterally. These findings support the universality assumption of basic emotions, at least in terms of facial expressions.

#### **Investigating Facial Emotional Expressions Using Facial EMG**

In some instances, participants' facial expressions are video-recorded while they are undergoing an experiment, and the recorded facial expressions are subjected to analyses. The *FACS* (Ekman & Friesen, 1978) can be used to code the presence of specific facial AUs, and a combination of certain facial AUs can be indicative of the presence of a specific facial emotion. For example, the co-presence of AU 6 (raising the cheek) and AU 12 (pulling lip corners outwards) would indicate the presence of a facial expression of happiness. Applying this method requires *FACS* training and is subject to inter-individual perceptual differences. For these reasons, automated facial action coding software has been developed based on *FACS* (e.g. FaceReader). When using the FaceReader software, video recordings of faces can be imported, and the software output provides coded AUs as well as timings for the six basic emotions, valence, and arousal values. However, good video quality and clearly visible faces are necessary for automatic detection of AUs/emotions, and thus, trained human decoders can outperform the software.
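
The logic of inferring an emotion from co-present AUs can be sketched in a few lines. This is a hedged illustration rather than the procedure of *FACS* or of any coding software: the names `EMOTION_PROTOTYPES` and `matching_emotions` are hypothetical, and the mapping contains only the two AU combinations given as examples in this chapter (happiness and disgust).

```python
# AU combinations taken only from the examples in this chapter; a fuller
# mapping would follow Table 17.1 and the FACS manual.
EMOTION_PROTOTYPES = {
    "happiness": frozenset({6, 12}),    # cheek raise + lip corners pulled outwards
    "disgust": frozenset({9, 10, 25}),  # nose wrinkle + upper lip raise + lips parted
}


def matching_emotions(coded_aus):
    """Return the emotions whose full AU combination is present in the coded AUs."""
    present = set(coded_aus)
    return sorted(
        emotion
        for emotion, required in EMOTION_PROTOTYPES.items()
        if required <= present  # subset test: every required AU was coded
    )


print(matching_emotions([6, 12, 25]))  # ['happiness']
print(matching_emotions([12]))         # []
```

The subset test makes explicit that a single AU (e.g. AU 12 alone) is not treated as sufficient; the whole combination has to be coded.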

Whether AUs are coded by humans or by software, visible movements are required for an AU to be coded. An alternative method for investigating facial emotional expressions is using facial electromyography (EMG). A great advantage of facial EMG is that it is a highly sensitive method able to ascertain the slightest contractions in facial muscles. Since fatty tissue and skin cover the muscles in the face, very slight muscle contractions are not necessarily visible to the naked eye but do occur nonetheless during the processing of emotion-related stimuli or the presence of emotion. It should be noted that emotional states are not always expressed, as the expression thereof often has a communicative or signalling function (Fridlund, 1994) that does not apply to all emotion-inducing situations. However, facial muscle contractions non-visible to observers are measurable using facial EMG (Cacioppo et al., 1986). Consequently, facial EMG can also detect facial muscle activity congruent with the affective state even when participants are instructed to suppress their emotional expression (Cacioppo et al., 1992).

So, how does facial EMG work? Whenever muscles are contracted, electricity is generated through the combined action potentials of an active motor unit, with the measurement unit being either millivolt (mV) or microvolt (μV). These action potentials are the result of depolarisation and repolarisation at the muscle fibre membrane. When a motor nerve is excited, transmitters are released in the motor endplates, and a potential is formed in the muscle fibre (Nazmi et al., 2016). Even during a resting state when muscles are not contracted, a muscle tonus is present which can be measured with EMG. The presence of this muscle tonus is the reason why baseline measures often need to be taken, i.e. to be able to evaluate the reaction to a stimulus relative to the baseline activity; the fast nature of facial expressions makes using a prestimulus baseline necessary. Two detecting electrodes are needed to assess the electricity in one muscle, one negative electrode (VIN-) and one positive electrode (VIN+). An additional electrode is used as a reference point, i.e. ground electrode. There are two different kinds of electrodes for EMG. Needle electrodes are more commonly used within medical settings, and surface electrodes (which are noninvasive) are generally used in psychological studies. This is because surface electrodes do not require medical training and do not risk infection and discomfort. It should be noted though that surface electrodes are not necessarily muscle-specific, as they can pick up muscle activity from a greater area than the confined area around the needle insertion point. Thus, it is advised to speak of facial muscle *sites* instead of specific muscles when measuring facial EMG. Guidelines on using facial EMG were published by Fridlund and Cacioppo (1986), which are still considered the gold standard today.
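
The two ideas in this paragraph, a bipolar recording from two detecting electrodes and the evaluation of a response relative to a prestimulus baseline, can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the samples are invented, and steps that real pipelines typically include (e.g. filtering and artefact rejection) are omitted.

```python
from statistics import mean


def bipolar_signal(vin_plus, vin_minus):
    """Difference between the two detecting electrodes, sample by sample."""
    return [p - m for p, m in zip(vin_plus, vin_minus)]


def baseline_corrected_response(signal_uv, stimulus_index):
    """Mean rectified activity after stimulus onset minus the prestimulus mean.

    signal_uv: bipolar EMG samples in microvolts.
    stimulus_index: index of the first sample recorded after stimulus onset.
    """
    rectified = [abs(x) for x in signal_uv]       # full-wave rectification
    baseline = mean(rectified[:stimulus_index])   # resting muscle tonus
    response = mean(rectified[stimulus_index:])   # activity after stimulus onset
    return response - baseline


# Invented samples: four prestimulus and four poststimulus values (in microvolts).
vin_plus = [2.0, -2.0, 2.0, -2.0, 6.0, -6.0, 6.0, -6.0]
vin_minus = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

signal = bipolar_signal(vin_plus, vin_minus)
print(baseline_corrected_response(signal, stimulus_index=4))  # 4.0
```

Subtracting the prestimulus mean expresses the response as a change from resting muscle tonus, which is what allows comparison across muscle sites and participants with different resting levels.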

#### **Investigating Affect and Emotion Using Facial EMG**

Affective states are associated with physiological responses across the body as described earlier, so one obvious use of facial EMG within emotion research is to investigate the presence of these affective states. Physiological measures such as electrocardiogram and galvanic skin response have long been applied when examining affect or specifically affective arousal (Alexander & Adlerstein, 1958; Block, 1957; Dimascio et al., 1957; Goldstein et al., 1965; Kaiser & Roessler, 1970; Oken, 1962; Vogel et al., 1958). Whereas most physiological measures are useful tools to measure affective arousal, they do not allow one to easily identify the valence of the experienced affective state. But in the 1970s, researchers started to use facial EMG and demonstrated its usefulness for differentiating affective states based on valence. For example, Schwartz et al. (1976) instructed participants to imagine happy, sad, and angry situations. The researchers distinguished between sad and happy states based on measurements from the corrugator and zygomaticus muscle sites. Cacioppo et al. (1986) demonstrated that based on measurements of the corrugator and zygomaticus facial muscle sites, mildly and moderately experienced affect can be differentiated according to its valence and also intensity. It should be noted that the resulting facial muscle activity in Cacioppo et al. (1986) was mainly covert (i.e. not visible), again highlighting the sensitivity of facial EMG. Such research findings underpin the association between the corrugator muscle site activity and negative affect (i.e. frowning) and the zygomaticus site activity with positive affect (i.e. smiling).

Published research thus far has most often investigated the facial muscle sites of corrugator and zygomaticus despite there being at least five muscles that are considered essential for the facial expression of basic emotions (see Waller et al., 2008). A reason for the preference of investigating the corrugator and zygomaticus facial muscle sites could be that a rudimentary differentiation of stimuli as either positive or negative is considered the first occurring process when faced with affective stimuli (Zajonc, 1980) and allows for investigation including a variety of stimuli of positive or negative valence without having to categorise the stimuli in distinct emotion categories. The categorisation and interpretation of an affective stimulus in specific emotion categories is often difficult. For example, a visual stimulus such as a static picture or a movie scene is often complex and can elicit a range of emotions. For instance, a scene of a bully physically attacking a person (from the film *My Bodyguard*) can elicit disgust and contempt for the bully and anger (and/or sadness) about the situation (see Gross & Levenson, 1995). The general responsiveness of the corrugator and zygomaticus muscle sites to negative and positive valence stimuli, respectively, overcomes this difficulty and makes them the standard choice within facial EMG research related to affect and emotion.

Another potential reason for not generally including multiple facial muscle sites in facial EMG research can be the issue of 'crosstalk'. That is, when neighbouring facial muscle sites are investigated, electrode pairs are necessarily placed close to one another. It is possible that an electrode pair of a non-activated muscle site records some of the activity from an adjacent activated muscle site, thus confounding results (Farina et al., 2004). Challenges like this might constitute one reason researchers generally measure fewer facial muscle sites that are not in close proximity. Corrugator and zygomaticus facial muscle sites are of sufficient distance from one another to not create crosstalk, and they also do not tend to activate simultaneously. Technological advances, however, have led to the recent development of smaller electrodes (i.e. with an outer diameter of <1 cm) which when placed carefully can potentially increase the number of electrode pairs used while still minimising possible crosstalk.

Corrugator and zygomaticus muscle sites are standard in facial EMG research, but there are many studies that included more facial muscle sites. For example, Vrana (1993) investigated multiple facial muscle sites to discriminate varying emotion experiences based on facial EMG. This researcher employed an imagery technique to have participants experience disgust, anger, pleasure, and joy while facial muscle activity was measured from the levator labii, corrugator, and zygomaticus sites. Results showed (1) higher activity in the levator site during disgust imagery than during anger imagery, (2) greater corrugator site activity during disgust and anger imagery compared to pleasure and joy imagery, and (3) increased zygomaticus site activity during joy imagery compared to anger, disgust, and pleasure imagery. This approach of comparing various emotion categories to each other based on the facial EMG activity at one muscle site is very common in facial EMG research. The approach is based on the assumption that specific facial action activation is indicative of a specific emotion such as a wrinkled nose resulting from the levator labii activation during the expression of disgust. However, facial emotional expressions generally include more than one facial feature activation, and some emotion categories share facial features. For example, corrugator activation is associated with facial expressions of anger, sadness, and fear (see Table 17.1) based on the overlapping facial feature of eyebrows pulled together. Such overlaps can make it difficult to draw precise conclusions about specific emotions based on individual muscle sites.

An alternative to investigating one facial muscle site per emotion category is to examine co-activations across several facial muscle sites for each emotion category. According to basic emotion theory, patterns of facial muscle activity should distinguish well between emotion categories. Fridlund et al. (1984) instructed participants to imagine situations related to feeling happiness, fear, anger, and sadness but also to pose the respective expressions while facial muscle activity was measured using EMG from the zygomaticus, corrugator, orbicularis oris, and orbicularis oculi sites. Their results showed that these emotion categories were differentiated from each other in valence based on facial EMG patterns across muscles for some, but not all, participants. But multiple emotions can be experienced during imagery, and there is significant inter-individual variability in displaying posed emotional expressions, both of which pose important limitations for this methodological approach.

Studies presented thus far involve participants imagining emotional situations and measuring aspects of their resultant emotional experience. However, facial reactions can also be measured as a participant's affective response to visual or auditory affective stimuli. For example, Larsen et al. (2003) presented participants with pictures, sounds, and words of positive and negative affective content and measured the zygomaticus and corrugator facial muscle sites while participants reported their affective states. A relationship was found between self-reported valence ratings and facial EMG activity. Positive valence ratings were associated with activity in the zygomaticus muscle site and negative valence ratings with corrugator site activity. Facial reactions to emotional stimuli can also be assessed using EMG. Dimberg (1988) presented happy and angry facial expressions to participants and measured corrugator and zygomaticus site activity as well as heart rate. Increased corrugator site activity, heart rate deceleration, and more subjective experiences of fear were found in response to angry stimuli compared to happy stimuli. Conversely, increased zygomaticus site activity and more subjective experiences of happiness were found in response to happy stimuli. A wide range of stimuli types with varying intensities can be used in research on the experiences and expression of affect and emotion and responses measured with facial EMG.

#### **Investigating Emotion-Related Processes Using Facial EMG**

The sensitivity of facial EMG in detecting facial muscle activity is of particular importance when examining phenomena that are difficult to observe with other approaches. For example, consider the investigation of covert facial mimicry. When we see a facial emotion expression, it is very likely that the muscles in our own face will become subtly activated in a manner that matches the observed expression. This phenomenon is commonly termed 'facial mimicry' and was first reported by Dimberg (1982). He investigated facial EMG from the zygomaticus and corrugator muscle sites while participants observed pictures of facial emotional expressions of anger and happiness. The results showed greater zygomaticus site activity in response to happiness than anger expressions and greater corrugator site activity in response to anger than happiness expressions. This phenomenon has since been replicated numerous times from the zygomaticus and corrugator muscle sites (for a review, see Hess & Fischer, 2013). These authors also list facial EMG studies where additional muscles were investigated in facial mimicry. For example, the levator labii muscle site has been reported to respond to observing facial expressions of disgust (Lundqvist, 1995; Lundqvist & Dimberg, 1995; Murata et al., 2016; Oberman et al., 2007; Rymarczyk et al., 2016) and the lateralis frontalis muscle site to expressions of fear (Lundqvist, 1995; Rymarczyk et al., 2016) and surprise (Lundqvist, 1995; Lundqvist & Dimberg, 1995; Murata et al., 2016). Nonetheless, the evidence is rather limited for matched facial muscle activation in observers for muscle sites other than the zygomaticus or corrugator.

Generally, studies on facial mimicry listed above investigated emotion-specific facial muscle activation in individual facial muscle sites for multiple emotion categories. As described earlier, some facial muscles are involved in the expression of various emotions (see Table 17.1). The corrugator muscle constitutes a prime example: it is involved in many expressions of negative affect and emotion. Thus, a different approach to showing differential facial muscle activation related to facial mimicry would be to investigate facial EMG across several muscles and consider the emerging activation patterns per emotion category, similar to the approach taken by Fridlund et al. (1984). Wingenbach et al. (2020) measured facial EMG from the corrugator, zygomaticus, depressor, levator, and frontalis facial muscle sites while participants watched dynamic facial expressions of the six basic emotions as well as the more complex emotions of contempt, pride, embarrassment, and neutral facial expressions (i.e. blank stares). The expected activation per muscle site based on previous work on facial emotional expressions was prespecified (as contrast coefficients) for each emotion category and treated as patterns (see Table 1 in Wingenbach et al., 2020; https://www.nature.com/articles/s41598-020-61563-5/tables/1). The measured EMG data across facial muscle sites per emotion category were compared to the theory-based expected patterns to investigate facial mimicry per emotion category. The measured EMG pattern of each emotion category with its expected pattern was also contrasted to expected patterns of emotion categories of the same valence category (positive, neutral, and negative) to test for distinctiveness. Results showed that the measured EMG data matched the expected patterns for most tested emotions.
Additionally, the measured EMG patterns for individual emotion categories were distinct within their own valence category for most tested emotions (see Figure 3 in Wingenbach et al., 2020). That is, the measured EMG data better fit the expected patterns of the target emotions than the expected patterns of non-target emotions of the same valence. These findings suggest that facial mimicry is a categorical mirroring of the observed facial emotional expression.
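
The idea of comparing a measured multi-site EMG pattern with prespecified contrast coefficients can be sketched as a weighted sum. The example below is an illustration only: the coefficients, the reduced set of three muscle sites, the measured values, and the function names are all hypothetical and are not the values or the statistical procedure reported by Wingenbach et al. (2020).

```python
def pattern_fit(measured, coefficients):
    """Contrast score: measured activity per site weighted by the expected coefficient."""
    return sum(measured[site] * coefficients[site] for site in coefficients)


def best_matching_emotion(measured, expected_patterns):
    """Return the emotion whose expected pattern yields the highest contrast score."""
    return max(
        expected_patterns,
        key=lambda emotion: pattern_fit(measured, expected_patterns[emotion]),
    )


# Hypothetical contrast coefficients (each set sums to zero).
EXPECTED = {
    "happiness": {"corrugator": -1, "zygomaticus": 2, "levator": -1},
    "anger": {"corrugator": 2, "zygomaticus": -1, "levator": -1},
    "disgust": {"corrugator": -1, "zygomaticus": -1, "levator": 2},
}

# Invented baseline-corrected activity while viewing a disgust expression.
measured_disgust_trial = {"corrugator": 0.2, "zygomaticus": -0.1, "levator": 0.9}

print(best_matching_emotion(measured_disgust_trial, EXPECTED))  # disgust
```

Comparing the score for the target pattern with the scores for non-target patterns of the same valence (here anger versus disgust) mirrors the distinctiveness question described above, although an actual analysis would test these contrasts statistically across participants.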

As many studies have now demonstrated, facial EMG can be a useful tool for emotion-specific investigations. Moreover, facial EMG can also be used to investigate variations in facial expressions within an emotion category. For example, research has shown that subtle variations in kinds of smiles are mimicked by observers (Korb et al., 2014; Krumhuber et al., 2014). These researchers recorded facial muscle activity from the corrugator, orbicularis oculi, and zygomaticus sites while participants viewed dynamic displays of various smiles operationalised as variations of AU combinations. These variations are possible because facial expressions of emotion can be posed volitionally, and such posed expressions often differ from spontaneous felt expressions in terms of included AUs. Moreover, judges can reliably discriminate between posed and felt facial expressions (e.g. McLellan et al., 2010). Results from Korb et al. (2014) showed that the recorded EMG activity corresponded with the AUs displayed in the stimuli, demonstrating feature-specific mimicry, similar to the results by Wingenbach et al. (2020). Such findings of specificity in facial muscle activation, in line with the observed stimulus, hint at facial mimicry being a mirroring of the stimulus content rather than an affective reaction to the stimulus, although more research is needed to examine this issue.

Facial EMG can also be used to differentiate between participants' felt and posed facial expressions of emotion. This differentiation is based on divergent temporal characteristics in posed and spontaneous facial expressions (Ekman & Friesen, 1982). For example, spontaneous smiles have a longer duration than posed smiles (Schmidt et al., 2006). Hess et al. (1988) instructed participants to pose or feel happiness and measured facial muscle activation across the zygomaticus, depressor anguli oris, corrugator, and masseter muscle sites. Temporal aspects of the facial EMG measurements (i.e. time mean, time variance, time skewness, and time kurtosis; Cacioppo et al., 1983) distinguished between posed and felt smiles. Such research findings demonstrate that facial EMG is a useful tool in assessing not only participants' different expressions across elicitation conditions but also their defining characteristics.
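As a rough illustration of such temporal measures, the sketch below treats a rectified EMG amplitude curve as a distribution over time and computes its first four moments. This is one common way to define time mean, variance, skewness, and kurtosis; whether it matches the exact computation in Cacioppo et al. (1983) is an assumption, and the function name `temporal_moments` and the example values are hypothetical.

```python
def temporal_moments(times, amplitudes):
    """Return (mean, variance, skewness, kurtosis) of time, weighted by amplitude.

    Amplitudes must be non-negative (e.g. rectified EMG) so that they can be
    normalised into weights that sum to one.
    """
    if len(times) != len(amplitudes):
        raise ValueError("times and amplitudes must have the same length")
    if any(a < 0 for a in amplitudes):
        raise ValueError("amplitudes must be non-negative")
    total = sum(amplitudes)
    if total <= 0:
        raise ValueError("amplitudes must not all be zero")
    weights = [a / total for a in amplitudes]
    mean = sum(w * t for w, t in zip(weights, times))
    variance = sum(w * (t - mean) ** 2 for w, t in zip(weights, times))
    if variance == 0:
        raise ValueError("activity is concentrated at a single time point")
    skewness = sum(w * (t - mean) ** 3 for w, t in zip(weights, times)) / variance ** 1.5
    kurtosis = sum(w * (t - mean) ** 4 for w, t in zip(weights, times)) / variance ** 2
    return mean, variance, skewness, kurtosis


# Made-up amplitude curves sampled once per second.
seconds = [0, 1, 2, 3, 4]
symmetric_response = [1, 2, 3, 2, 1]   # peaks in the middle
early_peak_response = [4, 3, 1, 1, 1]  # peaks at onset, then decays

sym_mean, sym_var, sym_skew, sym_kurt = temporal_moments(seconds, symmetric_response)
early_mean, early_var, early_skew, early_kurt = temporal_moments(seconds, early_peak_response)
```

A response that peaks early and decays slowly has a smaller time mean and a positive time skewness, whereas a symmetric response has a skewness of zero.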

Based on EMG's high temporal resolution, it is further possible to identify the onset and offset of an expression and to illustrate the development of an expression (e.g. identifying the peak). Achaibou et al. (2008) segmented the recorded signal of the facial muscle activity in the zygomaticus and the corrugator in response to observing expressions of happiness and anger (i.e. a facial mimicry paradigm) into 100 ms epochs. Facial muscle response onsets were defined by comparing the mean facial muscle activity per epoch in response to happy and angry facial emotional expressions to one another (per muscle). Corrugator activity in response to angry facial expressions began 200 ms after stimulus onset, whereas zygomaticus activity in response to happy facial expressions began 500 ms after stimulus onset. These findings suggest that the corrugator is activated more quickly. Angry expressions might be processed more rapidly than happy expressions, which could serve an evolutionary adaptive function. It is further possible that the corrugator is involved in the (stimulus-unspecific) orienting response (Dimberg, 1982) preceding the mimicry response. Moreover, since morphed dynamic stimuli were used in this study, which create artificial facial movements, it remains to be seen whether these timing differences also occur when participants view video-recorded facial emotional expressions that include the natural temporal characteristics of the facial emotional expressions. Investigations of the onsets of facial muscle activity when participants observe static facial emotional expressions have not shown differing onsets in the EMG signal in response to the stimuli (Dimberg & Thunberg, 1998).
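The epoch-based logic can be sketched as follows: the signal is cut into 100 ms epochs, the mean activity per epoch is computed, and the first epoch in which two conditions diverge is taken as the onset. The fixed difference threshold used here is a simplified stand-in for the statistical comparison per epoch, and the sampling rate, data, and function names are hypothetical.

```python
def epoch_means(samples, sampling_rate_hz, epoch_ms=100):
    """Mean activity per consecutive epoch; incomplete trailing samples are dropped."""
    per_epoch = int(sampling_rate_hz * epoch_ms / 1000)
    if per_epoch < 1:
        raise ValueError("sampling rate too low for the requested epoch length")
    n_epochs = len(samples) // per_epoch
    return [
        sum(samples[i * per_epoch:(i + 1) * per_epoch]) / per_epoch
        for i in range(n_epochs)
    ]


def first_divergent_epoch_ms(condition_a, condition_b, threshold, epoch_ms=100):
    """Start time (ms) of the first epoch where a exceeds b by more than threshold.

    Returns None if the two conditions never diverge by that amount.
    """
    for index, (a, b) in enumerate(zip(condition_a, condition_b)):
        if a - b > threshold:
            return index * epoch_ms
    return None


# Constructed corrugator data at an assumed 100 Hz (10 samples per 100 ms epoch):
# activity rises 200 ms after stimulus onset for angry faces only.
corrugator_angry = [0.0] * 20 + [1.0] * 30
corrugator_happy = [0.0] * 50

onset_ms = first_divergent_epoch_ms(
    epoch_means(corrugator_angry, 100),
    epoch_means(corrugator_happy, 100),
    threshold=0.5,
)
```

With real data, the comparison per epoch would be made across participants with an appropriate statistical test rather than with a fixed threshold.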

We have now seen application possibilities of facial EMG to assess the experience of affect and emotion, posed expressions of emotion, and responses related to the processing of stimuli of emotional content (e.g. facial expressions, words, sounds). Another application possibility is using facial EMG as a manipulation check. Some investigations include the manipulation of facial muscle activation in participants, and facial EMG can demonstrate the success of the manipulation. Examples of the manipulation of facial muscle activation are biting on a pen or holding a pen with the lips (e.g. Oberman et al., 2007; Wingenbach et al., 2018) or imitating observed facial expressions (e.g. Wingenbach et al., 2018). In Wingenbach et al. (2018), participants completed a facial emotion recognition task across two conditions with manipulated facial muscle activation, i.e. explicit imitation and pen in the mouth, alongside a control condition with no manipulation, while five different facial muscle sites were measured across the face. Participants showed increased activity (compared to the control condition) in all five facial muscle sites in the explicit imitation condition. The pen-holding condition showed the highest activity in the electrodes placed below the left mouth corner (see Figure 2 in Wingenbach et al., 2018). The measured facial muscle activity thus showed the pattern intended by the manipulations, and the facial EMG results served to verify the method. The study further showed that an incongruence between visual input (facial emotional expression in the stimuli) and motor action (activity induced under the mouth corner from pen-holding) hampered the recognition of facial emotional expressions with feature saliency in the lower part of the face/mouth region (here, disgust, happiness, embarrassment, contempt, and pride) based on accuracy rates. Emotional expressions with feature saliency in the lower part of the face all include lip movement either outwards or upwards, which is inhibited by the pressing of the lips induced by the pen-holding. Judges' lowered recognition rates might be due to a conflict between facial muscle movement observed in the stimuli and muscular feedback to the brain, which might also be part of a representation of the observed emotion (for more information on the embodiment of emotion, see Niedenthal, 2007). Thus, not only can facial EMG serve as a means to verify applied facial muscle manipulations, but facial EMG results can also inform the interpretation of obtained behavioural results (e.g. recognition rates), and new theoretical insights might be gained.

#### **Challenges of Using Facial EMG and How to Overcome Them**

In summary, this chapter highlighted the many strengths of facial EMG and some possible applications in research. Its most notable advantages are (1) increased objectivity relative to self-reports, (2) high sensitivity in detecting small muscle activations, and (3) high temporal resolution allowing for the assessment of rapidly changing activations characteristic of facial expressions. Nonetheless, facial EMG also comes with challenges. It is well known that awareness of the purpose of a measure or the hypotheses of a study can alter participants' behaviour. To avoid potential influences on the obtained EMG data, it is customary to keep participants blind to the true purpose of the electrodes. This can be achieved by using a cover story in the instructions provided to participants, for example that the electrodes measure temperature in various parts of the face. It is also possible that participants alter their natural facial behaviour simply because they have electrodes attached to their face. Some participants report during attachment that they are afraid the electrodes will come off, and others report that they feel restricted in their movements. These challenges can be overcome by ensuring proper electrode attachment (e.g. thorough cleaning of the skin) and asking participants to make grimaces to demonstrate secure electrode attachment. Generally, participants habituate to the electrodes quickly and no longer actively feel them. Acceptance of having electrodes attached to the face is generally high, as participants do not perceive the electrodes as disturbing or restricting (Wingenbach, 2010).

Facial muscles are rather small, and the guidelines for electrode placement must thus be carefully followed. When an electrode is misplaced by just 1 cm, it is already likely that non-targeted muscles are being recorded. While assessing facial muscle activity from multiple facial sites has numerous advantages, one should be aware of the potential for crosstalk between EMG sites. Researchers should make sure to have sufficient distance between electrode pairs; the smaller the electrodes, the better. Since facial EMG electrodes measure electrical activity, they are affected by ambient electromagnetic fields creating noise in the data, which can be minimised by collecting data within Faraday cages. It is further recommended to use shielded electrodes, keep electrical devices in the laboratory to a bare minimum, and use a notch filter on the recorded signal. Moreover, further filtering of the EMG data is necessary (e.g. high pass, low pass, moving average), and spike artefacts should be eliminated (e.g. see Wingenbach et al., 2020). Movement artefacts are common during EMG recordings, including sneezing, coughing, scratching, and yawning. Since these artefacts cannot easily be separated from the rest of the signal based on visual inspection, it is recommended to observe participants via camera, take notes including exact timing, and exclude those segments from data analysis. Every face is anatomically different, which includes variations in fatty tissue and muscle size. As a consequence, the recorded strength of the EMG signal has high inter-individual variability in addition to variability in responsiveness per se. To tackle this challenge, normalisation of the EMG data per participant is recommended (e.g. see Wingenbach et al., 2020).
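Some of these steps can be sketched in a few lines. The example below rectifies a signal, smooths it with a moving average, and masks the segments that were noted as movement artefacts during the session. It is a minimal sketch, not the pipeline of any cited study: notch, high-pass, and low-pass filtering are deliberately left out because they are normally done with a signal-processing library, and all function names, the sampling rate, and the data are hypothetical.

```python
def rectify(samples):
    """Full-wave rectification: take the absolute value of every sample."""
    return [abs(s) for s in samples]


def moving_average(samples, window):
    """Trailing moving average; the first samples use a shorter window."""
    if window < 1:
        raise ValueError("window must be at least one sample")
    smoothed = []
    running = 0.0
    for i, sample in enumerate(samples):
        running += sample
        if i >= window:
            running -= samples[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


def mask_artefacts(samples, sampling_rate_hz, artefact_windows_s):
    """Replace samples inside noted (start_s, end_s) windows with None."""
    cleaned = list(samples)
    for start_s, end_s in artefact_windows_s:
        first = max(0, int(start_s * sampling_rate_hz))
        last = min(len(cleaned), int(end_s * sampling_rate_hz))
        for i in range(first, last):
            cleaned[i] = None
    return cleaned


def mean_valid(samples):
    """Mean of all samples that were not masked as artefacts."""
    valid = [s for s in samples if s is not None]
    if not valid:
        raise ValueError("no valid samples left")
    return sum(valid) / len(valid)


# Made-up raw signal at an assumed 2 Hz, with a cough noted between 0.5 s and 1.0 s.
raw = [1, -2, 3, -4]
envelope = moving_average(rectify(raw), window=2)
cleaned = mask_artefacts(envelope, sampling_rate_hz=2, artefact_windows_s=[(0.5, 1.0)])
trial_mean = mean_valid(cleaned)
```

Masking rather than deleting the artefact samples keeps the remaining samples aligned with stimulus timing, which matters when epochs are later extracted relative to stimulus onset.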

Many investigations using facial EMG opt to *z*-standardise each participant's data before entering them into analyses. This is done for each measured facial muscle site across all experimental conditions but individually per participant. While this is indeed a legitimate approach to make the data comparable between participants, researchers are urged to consider the implications of *z*-standardisation for their results and whether the posed research question can be answered with *z*-standardised data. For example, should researchers wish to investigate whether there was an increase in facial muscle activity in response to a stimulus, then *z*-standardisation should not be done. *Z*-standardisation scales the mean activity from one channel (i.e. facial muscle site) across all trials to zero. Resulting positive *z*-values are thus to be interpreted as higher than average in response to a specific stimulus and negative *z*-values as lower than average. Care must thus be taken when interpreting these kinds of results. This problem is exemplified in a recent study by Wingenbach et al. (2020). The corrugator facial muscle site did not show an increase in activity in response to anger facial expression stimuli after a prestimulus baseline correction based on the non-standardised data. However, after *z*-standardisation of the corrugator site, positive *z*-values were obtained in response to anger facial expression stimuli (compare the third and fourth columns in Fig. 17.1). That is, the corrugator site showed higher activity in response to anger facial expressions than its average across the stimulus categories included in the task. But the resulting positive and negative *z*-values did not represent an increase or decrease in activity, respectively, as was demonstrated by the non-standardised data, which in fact showed a decrease in activity in response to anger facial expressions. An alternative to *z*-standardisation is provided by range correction, which does not alter the interpretation of the results. That is, after prestimulus baseline correction, positive values represent an increase in activity in response to a stimulus, and negative values represent a decrease.
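A small numerical sketch makes the sign problem concrete. The values are invented and stand for one participant's baseline-corrected corrugator responses per stimulus category. The scaling function `scale_by_range` is an assumed, sign-preserving variant that divides by the participant's range without centring; the exact range-correction formula used by Wingenbach et al. (2020) is not given here, so this should not be read as their method.

```python
from statistics import mean, pstdev

# Made-up baseline-corrected corrugator responses (stimulus minus prestimulus
# baseline) for one participant; every value is negative, i.e. a decrease.
baseline_corrected = {"anger": -0.1, "happiness": -0.8, "neutral": -0.3}


def z_standardise(values):
    """Centre on the mean across conditions and divide by the standard deviation.

    The population standard deviation is used here for simplicity.
    """
    data = list(values.values())
    centre = mean(data)
    spread = pstdev(data)
    if spread == 0:
        raise ValueError("all values are identical")
    return {label: (v - centre) / spread for label, v in values.items()}


def scale_by_range(values):
    """Assumed sign-preserving scaling: divide by (max - min) without centring."""
    data = list(values.values())
    span = max(data) - min(data)
    if span == 0:
        raise ValueError("all values are identical")
    return {label: v / span for label, v in values.items()}


z_scores = z_standardise(baseline_corrected)
range_scaled = scale_by_range(baseline_corrected)
```

In this example the anger response is a decrease relative to baseline, yet its *z*-value is positive because it is the smallest decrease among the conditions, whereas the range-scaled value keeps its negative sign.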


#### **Fig. 17.1** Facial muscle responses to facial expression of anger

*Note*. This figure is a composite of Figures 2 and 4 in Wingenbach et al. (2020). The first column shows the five measured facial muscle sites and the second column the expected facial muscle responses when participants viewed angry facial expressions. Blue bars indicate an (expected) increase compared to a prestimulus baseline, and gold bars indicate an (expected) decrease. The third column shows the measured facial muscle responses to angry faces; the EMG data were range-corrected, and no increase in activity occurred in the corrugator. The fourth column shows the *z*-standardised means with positive *z*-values for the corrugator, which are in fact based on a decrease in corrugator activity in response to angry faces.

#### **Conclusion**

Facial EMG is a sophisticated measurement tool that allows researchers to uncover subtle emotional components and thus deepen our understanding of emotion-related phenomena that occur in face-to-face social interaction (e.g. facial mimicry). It can also add to our knowledge of the experience and expression of emotions through faces, such as fine-grained temporal characteristics of facial emotional signalling. Based on sociocultural norms, people sometimes suppress their emotional feelings and experiences, which can include suppressing the associated facial expression. Facial EMG can provide information about the presence of a suppressed emotion that is otherwise not easily observable. This provides researchers with a useful alternative to self-reports, which are subjective in nature, require introspective abilities that vary across individuals, and are subject to a host of biases and normative constraints. Facial EMG further allows us to differentiate authentically felt emotion from posed affect/emotion and can uncover phenomena that we would not otherwise be aware of (e.g. facial mimicry). Overall, facial EMG is a valuable tool that is expanding our current knowledge on phenomena and processes associated with and underlying affect and emotion.

**Acknowledgement** I would like to thank Greg Bryant for procrastinating on his work by reviewing and proofing this chapter.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Index**

#### **A**

Active inference, 122, 130, 131, 136 Adolescence and childhood, 162–165, 167, 170 Aesthetic appreciation, 53, 57, 59 Aesthetic stimuli, 54–56, 58, 59 Amygdala, 9, 12–14, 24, 26, 29–34, 56, 66, 93, 97, 99, 100, 110, 127, 223–225, 239, 265, 284 Audio visuomotor neurons, 75

#### **B**

Brain imaging methods, 214, 218

#### **C**

Cognitive ethology, 275, 276

#### **D**

Default-mode network, 58, 147 Dopamine system, 7, 8, 10, 12, 15

#### **E**

EEG and ERP, 5, 24, 31–33, 44–47, 66–68, 70–73, 76, 80, 87, 89, 94, 97–99, 101, 149, 154, 166, 186, 195, 196, 198–200, 202, 204–207, 209, 214, 215, 224, 226, 234–236, 247, 262 Electromyography, 44, 57, 283, 287 Electrophysiology of language, 196, 197

Embodiment, 37–48, 294 Emergence, 101, 122, 136, 215 Emotion/affective decoding, 223–224 Emotional processing, 4, 13, 14, 23, 24, 26–29, 31–34, 56, 108–111, 224, 265 Emotional recognition, 162, 167 Emotional states, 11, 34, 44, 47, 54, 55, 70, 88, 90, 171, 238, 275, 285, 288 Emotion embodiment, 38, 43–48, 294 Empathy, 40, 42, 43, 47, 65, 70, 82, 86, 92, 98–102, 113, 132–134, 162, 163, 170, 238–240, 245, 258, 261 Eye tracking, 246, 247, 272, 274–277

#### **F**

Face *pareidolia*, 98 Facial EMG, 283, 287–296 Facial emotional expressions, 44, 46, 283, 286, 287, 290–293 Facial expressions, 14, 24–27, 29, 31–33, 40, 43–46, 48, 70, 86, 88–96, 98, 102, 163, 166, 262–264, 273, 274, 284–296 Facial muscles, 44, 46, 56, 285–296 Functional magnetic resonance imaging (fMRI), 4, 5, 9, 12–15, 24, 28–31, 33, 43, 56–58, 65–67, 69, 74, 76, 77, 79, 90, 97, 147, 154, 178, 179, 181, 186, 207, 214, 215, 224, 225, 232–237, 239, 240, 242–243, 246, 247 Functional near infrared spectroscopy (fNIRS), 186, 214, 215, 224, 226, 232–240, 244–247


#### **H**

Halo effect, 58 Hemispheric asymmetries, 68, 69, 87 Homeostasis, 3, 38, 121, 122, 127, 128, 132 Human emotions, 10, 14, 224 Hyperscanning, 184, 186, 187, 225, 232, 235–240, 242–244, 246, 247

#### **M**

Machine learning, 216 Mental improvisation, 146, 148, 149, 151, 152 Mental navigation, 146, 148–152 Mind wandering, 58, 146–154 Mirror neuron system, 39, 40, 65, 67, 96, 238–240, 245 Molecular imaging, 6, 11, 13 Moral psychology, 108 Morality, 107–109, 111–113, 167

#### **N**

Neurofeedback, 214, 224–226, 246 Neuropsychiatry, 179, 184 Neuroscience machine learning, 214–227

#### **O**

Opioid system, 9, 11–14, 182 Orthographic analysis, 197–199

#### **P**

Parental response, 86, 90, 93, 94 Perceptual decoupling, 146, 148–150, 152 Psychiatric disorders, 162, 169, 178–180, 182–187, 214, 223 Pupillometry, 275

#### **R**

Racial bias, 42, 43, 47, 210

#### **S**

Second-person neuroscience, 183, 184 Serotonin system, 14, 15 Sex hormones, 100–102 Social brain, 86, 184, 226, 232, 238, 239, 243, 244, 246, 247 Social embodiment, 40–43, 48 Social interaction, 12, 34, 42, 86, 102, 120, 121, 129, 130, 134–136, 162–164, 168, 169, 171, 178, 179, 183–187, 204, 225, 226, 238, 240, 242, 243, 245–247, 274–278, 296 Stereotypes and prejudices, 43, 48, 131, 200, 204–210, 258–259, 266 Subcortical visual pathway, 24, 26, 31, 34

#### **T**

Theory of mind, 66, 69, 152, 153, 162, 164, 165, 204, 238–240 Transcranial Direct Current Stimulation (tDCS), 256–259, 261–266 Transcranial Magnetic Stimulation (TMS), 79, 226, 256–260, 262, 263, 265, 266 Trust, 92, 102, 120–137, 259

#### **U**

Unconscious emotional responses, 26–29, 33

#### **W**

Whole brain coverage, 247